[PATCH 0/3] drm/v3d: add multiple in/out syncobjs support


Melissa Wen

18 Aug 2021

5:54 p.m.

Currently, v3d only supports a single in/out syncobj per submission (in v3d_submit_cl, there are two in_sync, one for the bin job and another for the render job); however, Vulkan queue submit operations expect multiple wait and signal semaphores. This series extends the v3d interface and job dependency operations to handle more than one in/out syncobj.

The main difference from the RFC series[1] is that here I already worked on top of Daniel Vetter's work for drm/scheduler[2] (even if the series is missing a few acks).

The first patch just decouples the steps to look up and add a job dependency from the job init code, since the operation repeats for every syncobj that a job should wait on before starting. The third patch of this series then reuses it to handle multiple wait semaphores.

The second patch extends our interface by using a generic extension. This approach was inspired by i915_user_extension[3] and amd_cs_chunks[4] to give a little more flexibility in adding other submission features in the future. Therefore, the list of extensions would work as a hub of features that use an id to determine the corresponding feature data type.
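A minimal userspace sketch of the "hub of features" idea, assuming a local mirror of the node layout from patch 2/3 (next, ext_data, id). The ids EXT_ID_FOO and EXT_ID_BAR and the helper names are hypothetical, and the NULL guard in the loop is an addition for the sketch, not something the patch does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the extension node from patch 2/3 (not the uapi header). */
struct ext_node {
	uint64_t next;     /* address of the next node, 0 if none */
	uint64_t ext_data; /* address of the payload that id selects */
	uint32_t id;       /* selects the payload type */
};

/* Hypothetical ids, for illustration only. */
enum { EXT_ID_FOO = 1, EXT_ID_BAR = 2 };

/* Walks up to count nodes; returns how many were handled, or -1 on an
 * unknown id. The "&& node" guard is an assumption of this sketch.
 */
static int walk_extensions(uint64_t head, uint32_t count)
{
	const struct ext_node *node = (const struct ext_node *)(uintptr_t)head;
	int handled = 0;

	for (uint32_t i = 0; i < count && node; i++) {
		switch (node->id) {
		case EXT_ID_FOO:
		case EXT_ID_BAR:
			handled++; /* a real parser would decode ext_data here */
			break;
		default:
			return -1;
		}
		node = (const struct ext_node *)(uintptr_t)node->next;
	}
	return handled;
}

/* Builds a two-node chain (foo -> bar) and walks it. */
static int demo_two_extensions(void)
{
	struct ext_node bar = { .next = 0, .ext_data = 0, .id = EXT_ID_BAR };
	struct ext_node foo = { .next = (uintptr_t)&bar, .ext_data = 0,
				.id = EXT_ID_FOO };

	return walk_extensions((uintptr_t)&foo, 2);
}
```

Each new submission feature would then add one id and one payload type, leaving the submit structs themselves unchanged.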

With this base, the third patch adds support for multiple wait/signal semaphores. For this, we add a new data type (drm_v3d_multi_sync) to the list of generic extensions. It points to two arrays of syncobjs (in/out) and also determines (via flags) whether the dependencies must be added to the bin job or the render job (in the case of v3d_submit_cl). An auxiliary struct (v3d_submit_ext) is used when parsing submission extensions. Finally, we reserve some space in the semaphore struct (drm_v3d_sem) to accommodate timeline semaphores, which we aim to support soon (the same reason for already defining v3d_submit_outsync).

[1] https://patchwork.freedesktop.org/series/93388/
[2] https://patchwork.freedesktop.org/series/93413/
[3] https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/i915/i915_u...
[4] https://cgit.freedesktop.org/drm/drm-misc/tree/include/uapi/drm/amdgpu_drm.h...

On the Mesa side, the work related to this series is available at https://gitlab.freedesktop.org/mwen/mesa/-/commit/77bd2b21f61a9caeced934bd13... where I checked these changes using Mesa CI.

Melissa Wen (3):
  drm/v3d: decouple adding job dependencies steps from job init
  drm/v3d: add generic ioctl extension
  drm/v3d: add multiple syncobjs support

 drivers/gpu/drm/v3d/v3d_drv.c |   7 +-
 drivers/gpu/drm/v3d/v3d_drv.h |  14 ++
 drivers/gpu/drm/v3d/v3d_gem.c | 303 ++++++++++++++++++++++++++++------
 include/uapi/drm/v3d_drm.h    |  76 ++++++++-
 4 files changed, 350 insertions(+), 50 deletions(-)

-- 2.30.2


Melissa Wen

18 Aug

5:55 p.m.

New subject: [PATCH 1/3] drm/v3d: decouple adding job dependencies steps from job init

Prep work to enable a job to wait for more than one syncobj before starting. Also get rid of old checkpatch warnings in the v3d_gem file. No functional changes.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/v3d/v3d_gem.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index a3529809d547..593ed2206d74 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -416,7 +416,7 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;

 	ret = drm_gem_dma_resv_wait(file_priv, args->handle,
-					      true, timeout_jiffies);
+				    true, timeout_jiffies);

 	/* Decrement the user's timeout, in case we got interrupted
 	 * such that the ioctl will be restarted.
@@ -434,12 +434,25 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }

+static int
+v3d_job_add_deps(struct drm_file *file_priv, struct v3d_job *job,
+		 u32 in_sync, u32 point)
+{
+	struct dma_fence *in_fence = NULL;
+	int ret;
+
+	ret = drm_syncobj_find_fence(file_priv, in_sync, point, 0, &in_fence);
+	if (ret == -EINVAL)
+		return ret;
+
+	return drm_sched_job_add_dependency(&job->base, in_fence);
+}
+
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	     struct v3d_job *job, void (*free)(struct kref *ref),
 	     u32 in_sync, enum v3d_queue queue)
 {
-	struct dma_fence *in_fence = NULL;
 	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	int ret;

@@ -455,11 +468,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	if (ret)
 		goto fail;

-	ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
-	if (ret == -EINVAL)
-		goto fail_job;
-
-	ret = drm_sched_job_add_dependency(&job->base, in_fence);
+	ret = v3d_job_add_deps(file_priv, job, in_sync, 0);
 	if (ret)
 		goto fail_job;

@@ -499,7 +508,7 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
 	for (i = 0; i < job->bo_count; i++) {
 		/* XXX: Use shared fences for read-only objects. */
 		dma_resv_add_excl_fence(job->bo[i]->resv,
-						  job->done_fence);
+					job->done_fence);
 	}

 	drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
@@ -904,8 +913,7 @@ v3d_gem_init(struct drm_device *dev)
 	if (!v3d->pt) {
 		drm_mm_takedown(&v3d->mm);
 		dev_err(v3d->drm.dev,
-			"Failed to allocate page tables. "
-			"Please ensure you have CMA enabled.\n");
+			"Failed to allocate page tables. Please ensure you have CMA enabled.\n");
 		return -ENOMEM;
 	}

-- 2.30.2
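To illustrate why factoring out the lookup-and-add step helps, here is a rough userspace mock of calling such a helper once per wait syncobj. Everything here is a stand-in: struct mock_job, mock_job_add_dep() and the rule that handle 0 is skipped are assumptions of the sketch, not the driver's types or behaviour:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPS 8

/* Stand-in for a job's dependency list; not the kernel's struct v3d_job. */
struct mock_job {
	uint32_t deps[MAX_DEPS];
	size_t dep_count;
};

/* Stand-in for the factored-out "look up and add one dependency" step.
 * In this mock, handle 0 is skipped and a full list is an error.
 */
static int mock_job_add_dep(struct mock_job *job, uint32_t in_sync)
{
	if (in_sync == 0)
		return 0;
	if (job->dep_count == MAX_DEPS)
		return -1;
	job->deps[job->dep_count++] = in_sync;
	return 0;
}

/* With the step isolated, waiting on N syncobjs is a plain loop. */
static int mock_job_add_deps(struct mock_job *job,
			     const uint32_t *in_syncs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		int ret = mock_job_add_dep(job, in_syncs[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

The single-syncobj path is then just the count == 1 case of the same loop.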

Melissa Wen

5:56 p.m.

New subject: [PATCH 2/3] drm/v3d: add generic ioctl extension

Add support to attach generic extensions on job submission. This patch is the second piece of prep work to enable multiple syncobjs on job submission. With this work, when the job submission interface needs to be extended to accommodate a new feature, we will use a generic extension struct where an id determines the data type to be pointed to. The first application is to enable multiple in/out syncobjs (next patch), but the base is already in place for future features.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/v3d/v3d_drv.c |  4 +-
 drivers/gpu/drm/v3d/v3d_gem.c | 80 ++++++++++++++++++++++++++++++++---
 include/uapi/drm/v3d_drm.h    | 38 ++++++++++++++++-
 3 files changed, 113 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c
index 9403c3b36aca..6a0516160bb2 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.c
+++ b/drivers/gpu/drm/v3d/v3d_drv.c
@@ -83,7 +83,6 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data,
 		return 0;
 	}

-
 	switch (args->param) {
 	case DRM_V3D_PARAM_SUPPORTS_TFU:
 		args->value = 1;
@@ -147,7 +146,7 @@ v3d_postclose(struct drm_device *dev, struct drm_file *file)
 DEFINE_DRM_GEM_FOPS(v3d_drm_fops);

 /* DRM_AUTH is required on SUBMIT_CL for now, while we don't have GMP
- * protection between clients. Note that render nodes would be be
+ * protection between clients. Note that render nodes would be
  * able to submit CLs that could access BOs from clients authenticated
  * with the master node. The TFU doesn't use the GMP, so it would
  * need to stay DRM_AUTH until we do buffer size/offset validation.
@@ -222,7 +221,6 @@ static int v3d_platform_drm_probe(struct platform_device *pdev)
 	u32 mmu_debug;
 	u32 ident1;

-
 	v3d = devm_drm_dev_alloc(dev, &v3d_drm_driver, struct v3d_dev, drm);
 	if (IS_ERR(v3d))
 		return PTR_ERR(v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 593ed2206d74..e254919b6c5e 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -521,6 +521,38 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
 	}
 }

+static int
+v3d_get_extensions(struct drm_file *file_priv,
+		   u32 ext_count, u64 ext_handles)
+{
+	int i;
+	struct drm_v3d_extension __user *handles;
+
+	if (!ext_count)
+		return 0;
+
+	handles = u64_to_user_ptr(ext_handles);
+	for (i = 0; i < ext_count; i++) {
+		struct drm_v3d_extension ext;
+
+		if (copy_from_user(&ext, handles, sizeof(ext))) {
+			DRM_DEBUG("Failed to copy submit extension\n");
+			return -EFAULT;
+		}
+
+		switch (ext.id) {
+		case 0:
+		default:
+			DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id);
+			return -EINVAL;
+		}
+
+		handles = u64_to_user_ptr(ext.next);
+	}
+
+	return 0;
+}
+
 /**
  * v3d_submit_cl_ioctl() - Submits a job (frame) to the V3D.
  * @dev: DRM device
@@ -549,15 +581,23 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,

 	trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end);

-	if (args->pad != 0)
-		return -EINVAL;
-
-	if (args->flags != 0 &&
-	    args->flags != DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
+	if (args->flags &&
+	    args->flags & ~(DRM_V3D_SUBMIT_CL_FLUSH_CACHE |
+			    DRM_V3D_SUBMIT_EXTENSION)) {
 		DRM_INFO("invalid flags: %d\n", args->flags);
 		return -EINVAL;
 	}

+	if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
+		ret = v3d_get_extensions(file_priv,
+					 args->extension_count,
+					 args->extensions);
+		if (ret) {
+			DRM_DEBUG("Failed to get extensions.\n");
+			return ret;
+		}
+	}
+
 	render = kcalloc(1, sizeof(*render), GFP_KERNEL);
 	if (!render)
 		return -ENOMEM;
@@ -711,6 +751,21 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,

 	trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia);

+	if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
+		DRM_DEBUG("invalid flags: %d\n", args->flags);
+		return -EINVAL;
+	}
+
+	if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
+		ret = v3d_get_extensions(file_priv,
+					 args->extension_count,
+					 args->extensions);
+		if (ret) {
+			DRM_DEBUG("Failed to get extensions.\n");
+			return ret;
+		}
+	}
+
 	job = kcalloc(1, sizeof(*job), GFP_KERNEL);
 	if (!job)
 		return -ENOMEM;
@@ -806,6 +861,21 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}

+	if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
+		DRM_DEBUG("Invalid flags: %d\n", args->flags);
+		return -EINVAL;
+	}
+
+	if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
+		ret = v3d_get_extensions(file_priv,
+					 args->extension_count,
+					 args->extensions);
+		if (ret) {
+			DRM_DEBUG("Failed to get extensions.\n");
+			return ret;
+		}
+	}
+
 	job = kcalloc(1, sizeof(*job), GFP_KERNEL);
 	if (!job)
 		return -ENOMEM;
diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h
index 4104f22fb3d3..1f4706010eb5 100644
--- a/include/uapi/drm/v3d_drm.h
+++ b/include/uapi/drm/v3d_drm.h
@@ -58,6 +58,19 @@ extern "C" {
 						   struct drm_v3d_perfmon_get_values)

 #define DRM_V3D_SUBMIT_CL_FLUSH_CACHE             0x01
+#define DRM_V3D_SUBMIT_EXTENSION                  0x02
+
+/* struct drm_v3d_extension - ioctl extensions
+ *
+ * Linked-list of generic extensions where the id identify which struct is
+ * pointed by ext_data. Therefore, DRM_V3D_EXT_ID_* is used on id to identify
+ * the extension type.
+ */
+struct drm_v3d_extension {
+	__u64 next;
+	__u64 ext_data;
+	__u32 id;
+};

 /**
  * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
@@ -135,12 +148,17 @@ struct drm_v3d_submit_cl {
 	/* Number of BO handles passed in (size is that times 4). */
 	__u32 bo_handle_count;

+	/* DRM_V3D_SUBMIT_* properties */
 	__u32 flags;

 	/* ID of the perfmon to attach to this job. 0 means no perfmon. */
 	__u32 perfmon_id;

-	__u32 pad;
+	/* Number of extensions*/
+	__u32 extension_count;
+
+	/* Pointer to a list of ioctl extensions*/
+	__u64 extensions;
 };

 /**
@@ -248,6 +266,15 @@ struct drm_v3d_submit_tfu {
 	__u32 in_sync;
 	/* Sync object to signal when the TFU job is done. */
 	__u32 out_sync;
+
+	/* Number of extensions*/
+	__u32 extension_count;
+
+	/* Pointer to an array of ioctl extensions*/
+	__u64 extensions;
+
+	/* DRM_V3D_SUBMIT_* properties */
+	__u32 flags;
 };

 /* Submits a compute shader for dispatch. This job will block on any
@@ -276,6 +303,15 @@ struct drm_v3d_submit_csd {

 	/* ID of the perfmon to attach to this job. 0 means no perfmon. */
 	__u32 perfmon_id;
+
+	/* DRM_V3D_SUBMIT_* properties */
+	__u32 flags;
+
+	/* Number of extensions*/
+	__u32 extension_count;
+
+	/* Pointer to a list of ioctl extensions*/
+	__u64 extensions;
 };

 enum {

-- 2.30.2
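The patch validates flags in two different shapes: a mask check in the CL ioctl and a presence check in the TFU and CSD ioctls. The standalone sketch below mirrors both conditions so they can be compared. The macro values match the patch, but the function names are made up for illustration and this is one reading of the hunks, not driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit values as DRM_V3D_SUBMIT_CL_FLUSH_CACHE / DRM_V3D_SUBMIT_EXTENSION. */
#define SUBMIT_CL_FLUSH_CACHE 0x01u
#define SUBMIT_EXTENSION      0x02u

/* Mask-style check (CL hunk): any bit outside the known set is rejected. */
static bool cl_flags_valid(uint32_t flags)
{
	return !(flags & ~(SUBMIT_CL_FLUSH_CACHE | SUBMIT_EXTENSION));
}

/* Presence-style check (TFU/CSD hunks): non-zero flags are rejected only
 * when the extension bit is absent.
 */
static bool tfu_csd_flags_valid(uint32_t flags)
{
	return !(flags && !(flags & SUBMIT_EXTENSION));
}
```

Under this reading, a value such as 0x06 (extension bit plus an undefined bit) is rejected by the mask-style check but accepted by the presence-style one, which may or may not be intended.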

Iago Toral

15 Sep

1:24 p.m.

New subject: [PATCH 2/3] drm/v3d: add generic ioctl extension

On Wed, 2021-08-18 at 18:56 +0100, Melissa Wen wrote:

(...)

@@ -248,6 +266,15 @@ struct drm_v3d_submit_tfu {
 	__u32 in_sync;
 	/* Sync object to signal when the TFU job is done. */
 	__u32 out_sync;
+
+	/* Number of extensions*/
+	__u32 extension_count;
+
+	/* Pointer to an array of ioctl extensions*/
+	__u64 extensions;
+
+	/* DRM_V3D_SUBMIT_* properties */
+	__u32 flags;

A silly nit: maybe put flags before the extension fields above for consistency with the CSD and CL submission commands.

(...)

Melissa Wen

4:28 p.m.

New subject: [PATCH 2/3] drm/v3d: add generic ioctl extension

On 09/15, Iago Toral wrote:

...

On Wed, 2021-08-18 at 18:56 +0100, Melissa Wen wrote:

(...)

@@ -248,6 +266,15 @@ struct drm_v3d_submit_tfu {
 	__u32 in_sync;
 	/* Sync object to signal when the TFU job is done. */
 	__u32 out_sync;
+
+	/* Number of extensions*/
+	__u32 extension_count;
+
+	/* Pointer to an array of ioctl extensions*/
+	__u64 extensions;
+
+	/* DRM_V3D_SUBMIT_* properties */
+	__u32 flags;

A silly nit: maybe put flags before the extension fields above for consistency with the CSD and CL submission commands.

hmm.. I arranged it that way for alignment reasons (afaiu), but I can (or should) include a __u32 pad right after out_sync to set these in the same sequence.

(...)

Iago Toral

4:32 p.m.

New subject: [PATCH 2/3] drm/v3d: add generic ioctl extension

On Wed, 2021-09-15 at 17:28 +0100, Melissa Wen wrote:

...

On 09/15, Iago Toral wrote:

...
On Wed, 2021-08-18 at 18:56 +0100, Melissa Wen wrote:

(...)

@@ -248,6 +266,15 @@ struct drm_v3d_submit_tfu {
 	__u32 in_sync;
 	/* Sync object to signal when the TFU job is done. */
 	__u32 out_sync;
+
+	/* Number of extensions*/
+	__u32 extension_count;
+
+	/* Pointer to an array of ioctl extensions*/
+	__u64 extensions;
+
+	/* DRM_V3D_SUBMIT_* properties */
+	__u32 flags;

A silly nit: maybe put flags before the extension fields above for consistency with the CSD and CL submission commands.

hmm.. I arranged it that way for alignment reasons (afaiu), but I can (or should) include a __u32 pad right after out_sync to set these in the same sequence.

Ah, that's fine, my suggestion was just for style, let's keep it as is.

Iago
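The alignment point in this exchange can be seen with a small, self-contained layout check. The two structs below are hypothetical two- and three-field tails, not the real drm_v3d_submit_tfu (whose earlier fields also affect the offsets). They only show what an explicit 32-bit pad before a 64-bit member buys:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tail: a 64-bit member directly after one 32-bit member.
 * The compiler may insert a 4-byte hole here, depending on the ABI.
 */
struct tail_implicit {
	uint32_t out_sync;
	uint64_t extensions;
};

/* Same tail with the hole made explicit, so every ABI agrees on the layout. */
struct tail_explicit {
	uint32_t out_sync;
	uint32_t pad;
	uint64_t extensions;
};

_Static_assert(offsetof(struct tail_explicit, extensions) == 8,
	       "64-bit member lands on an 8-byte boundary");
_Static_assert(sizeof(struct tail_explicit) == 16,
	       "no implicit padding remains");

/* Size of the compiler-inserted hole in the implicit variant (ABI dependent). */
static size_t implicit_hole(void)
{
	return offsetof(struct tail_implicit, extensions) - sizeof(uint32_t);
}
```

With the explicit pad, 32-bit and 64-bit userspace see the same offsets, which is the usual reason uapi structs avoid implicit holes.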

Daniel Vetter

16 Sep

12:19 p.m.

New subject: [PATCH 2/3] drm/v3d: add generic ioctl extension

On Wed, Aug 18, 2021 at 06:56:41PM +0100, Melissa Wen wrote:

(...)

+/* struct drm_v3d_extension - ioctl extensions

Linked-list of generic extensions where the id identify which struct is

pointed by ext_data. Therefore, DRM_V3D_EXT_ID_* is used on id to identify

the extension type.

*/

+struct drm_v3d_extension {

__u64 next;

Why do you both need a next pointer here and extension_count everywhere? That seems one too much.

...

__u64 ext_data;

This isn't needed if you link them. Instead each extension can subclass this struct here, and add whatever parameter they need there. Some extensions could be just a flag which only needs to be the extension present. Maybe what you want here is a __u32 for flags? Solves also the aligning.

...

__u32 id;

Align to 64bit just to be save.

One thing I wondered is whether we shouldn't lift this to be a drm thing. i915 has something similar with i915_user_extension.

That way we could share some helpers for parsing these, and people would do extensible drm ioctls all the same way? -Daniel

...

+};
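A minimal userspace sketch of the chained layout suggested in the review comments above, assuming a hypothetical base header (ext_base) that each extension embeds as its first member. None of these names come from the patch or from i915; they are illustration only. The chain ends at next == 0, so no separate count is needed, and the flags word pads the header to 16 bytes. A kernel implementation would read each node with copy_from_user() rather than dereferencing the address directly, and would likely cap the chain length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical base header; every extension embeds it as its first member. */
struct ext_base {
	uint64_t next;  /* address of the next ext_base, 0 terminates the chain */
	uint32_t id;    /* selects the concrete extension type */
	uint32_t flags; /* also keeps the header a multiple of 8 bytes */
};

_Static_assert(sizeof(struct ext_base) == 16, "header has no hidden padding");

#define EXT_ID_MULTI_SYNC 0x01 /* hypothetical id */

/* Hypothetical "subclass": the payload follows the embedded header. */
struct ext_multi_sync {
	struct ext_base base;
	uint64_t in_syncs;
	uint64_t out_syncs;
	uint32_t in_sync_count;
	uint32_t out_sync_count;
};

/*
 * Walk the chain starting at 'first'. Stores the multisync extension in
 * *ms if one is present. Returns 0 on success, -EINVAL on an unknown id
 * or a duplicated extension.
 */
static int walk_extensions(uint64_t first, const struct ext_multi_sync **ms)
{
	uint64_t addr = first;

	assert(ms != NULL);
	*ms = NULL;

	while (addr) {
		const struct ext_base *ext =
			(const struct ext_base *)(uintptr_t)addr;

		switch (ext->id) {
		case EXT_ID_MULTI_SYNC:
			if (*ms)
				return -EINVAL;
			*ms = (const struct ext_multi_sync *)ext;
			break;
		default:
			return -EINVAL;
		}
		addr = ext->next;
	}
	return 0;
}
```

With this shape, an extension that is only a switch would consist of the header alone, with its meaning carried by id and flags.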

/**

struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D

@@ -135,12 +148,17 @@ struct drm_v3d_submit_cl { /* Number of BO handles passed in (size is that times 4). */ __u32 bo_handle_count;

/* DRM_V3D_SUBMIT_* properties */ __u32 flags;

/* ID of the perfmon to attach to this job. 0 means no perfmon. */ __u32 perfmon_id;

__u32 pad;

/* Number of extensions*/

__u32 extension_count;

/* Pointer to a list of ioctl extensions*/

__u64 extensions;

};

/** @@ -248,6 +266,15 @@ struct drm_v3d_submit_tfu { __u32 in_sync; /* Sync object to signal when the TFU job is done. */ __u32 out_sync;

/* Number of extensions*/

__u32 extension_count;

/* Pointer to an array of ioctl extensions*/

__u64 extensions;

/* DRM_V3D_SUBMIT_* properties */

__u32 flags;

};

/* Submits a compute shader for dispatch. This job will block on any @@ -276,6 +303,15 @@ struct drm_v3d_submit_csd {

/* ID of the perfmon to attach to this job. 0 means no perfmon. */ __u32 perfmon_id;

/* DRM_V3D_SUBMIT_* properties */

__u32 flags;

/* Number of extensions*/

__u32 extension_count;

/* Pointer to a list of ioctl extensions*/

__u64 extensions;

};

enum {

2.30.2

-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch

Melissa Wen

8:42 p.m.

New subject: [PATCH 2/3] drm/v3d: add generic ioctl extension

On 09/16, Daniel Vetter wrote:

...

On Wed, Aug 18, 2021 at 06:56:41PM +0100, Melissa Wen wrote:

...
Add support for attaching generic extensions on job submission. This patch is the second piece of prep work to enable multiple syncobjs on job submission. With this work, when the job submission interface needs to be extended to accommodate a new feature, we will use a generic extension struct where an id determines the data type pointed to. The first application is to enable multiple in/out syncobjs (next patch), but the base is already in place for future features.

Signed-off-by: Melissa Wen mwen@igalia.com

drivers/gpu/drm/v3d/v3d_drv.c | 4 +- drivers/gpu/drm/v3d/v3d_gem.c | 80 ++++++++++++++++++++++++++++++++--- include/uapi/drm/v3d_drm.h | 38 ++++++++++++++++- 3 files changed, 113 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c index 9403c3b36aca..6a0516160bb2 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.c +++ b/drivers/gpu/drm/v3d/v3d_drv.c @@ -83,7 +83,6 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data, return 0; }

switch (args->param) { case DRM_V3D_PARAM_SUPPORTS_TFU: args->value = 1;

@@ -147,7 +146,7 @@ v3d_postclose(struct drm_device *dev, struct drm_file *file) DEFINE_DRM_GEM_FOPS(v3d_drm_fops);

/* DRM_AUTH is required on SUBMIT_CL for now, while we don't have GMP

protection between clients. Note that render nodes would be be

protection between clients. Note that render nodes would be

able to submit CLs that could access BOs from clients authenticated

with the master node. The TFU doesn't use the GMP, so it would

need to stay DRM_AUTH until we do buffer size/offset validation.

@@ -222,7 +221,6 @@ static int v3d_platform_drm_probe(struct platform_device *pdev) u32 mmu_debug; u32 ident1;

v3d = devm_drm_dev_alloc(dev, &v3d_drm_driver, struct v3d_dev, drm); if (IS_ERR(v3d)) return PTR_ERR(v3d);

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 593ed2206d74..e254919b6c5e 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -521,6 +521,38 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, } }

+static int +v3d_get_extensions(struct drm_file *file_priv,
   u32 ext_count, u64 ext_handles)
+{
int i;

struct drm_v3d_extension __user *handles;

if (!ext_count)
return 0;
handles = u64_to_user_ptr(ext_handles);

for (i = 0; i < ext_count; i++) {
struct drm_v3d_extension ext;
if (copy_from_user(&ext, handles, sizeof(ext))) {
	DRM_DEBUG("Failed to copy submit extension\n");
	return -EFAULT;
}
switch (ext.id) {
case 0:
default:
	DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id);
	return -EINVAL;
}
handles = u64_to_user_ptr(ext.next);
}

return 0;
+}

/**

v3d_submit_cl_ioctl() - Submits a job (frame) to the V3D.

@dev: DRM device

@@ -549,15 +581,23 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,

trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end);
if (args->pad != 0)
return -EINVAL;
if (args->flags != 0 &&
   args->flags != DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
if (args->flags &&
   args->flags & ~(DRM_V3D_SUBMIT_CL_FLUSH_CACHE |
	    DRM_V3D_SUBMIT_EXTENSION)) {
DRM_INFO("invalid flags: %d\n", args->flags); return -EINVAL; }
if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
			 args->extension_count,
			 args->extensions);
if (ret) {
	DRM_DEBUG("Failed to get extensions.\n");
	return ret;
}
}

render = kcalloc(1, sizeof(*render), GFP_KERNEL); if (!render) return -ENOMEM;
@@ -711,6 +751,21 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,

trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia);
if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
DRM_DEBUG("invalid flags: %d\n", args->flags);
return -EINVAL;
}

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
			 args->extension_count,
			 args->extensions);
if (ret) {
	DRM_DEBUG("Failed to get extensions.\n");
	return ret;
}
}

job = kcalloc(1, sizeof(*job), GFP_KERNEL); if (!job) return -ENOMEM;
@@ -806,6 +861,21 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, return -EINVAL; }
if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
DRM_DEBUG("Invalid flags: %d\n", args->flags);
return -EINVAL;
}

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
			 args->extension_count,
			 args->extensions);
if (ret) {
	DRM_DEBUG("Failed to get extensions.\n");
	return ret;
}
}

job = kcalloc(1, sizeof(*job), GFP_KERNEL); if (!job) return -ENOMEM;
diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 4104f22fb3d3..1f4706010eb5 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -58,6 +58,19 @@ extern "C" { struct drm_v3d_perfmon_get_values)

#define DRM_V3D_SUBMIT_CL_FLUSH_CACHE 0x01 +#define DRM_V3D_SUBMIT_EXTENSION 0x02

+/* struct drm_v3d_extension - ioctl extensions

Linked-list of generic extensions where the id identify which struct is

pointed by ext_data. Therefore, DRM_V3D_EXT_ID_* is used on id to identify

the extension type.

*/

+struct drm_v3d_extension {

__u64 next;
Why do you need both a next pointer here and extension_count everywhere? That seems like one too many.

...

__u64 ext_data;

This isn't needed if you link them. Instead, each extension can subclass this struct and add whatever parameters it needs. Some extensions could be just a flag, where the only thing that matters is that the extension is present. Maybe what you want here is a __u32 for flags? That also solves the alignment.

...

__u32 id;

Align to 64 bits just to be safe.

Hi Daniel,

Thanks for suggesting these improvements. I'll polish it for the next version.

...

One thing I wondered is whether we shouldn't lift this to be a drm thing. i915 has something similar with i915_user_extension.

That way we could share some helpers for parsing these, and people would do extensible drm ioctls all the same way?

I think so. I based this on the ioctl extension solutions in the i915 and amd drivers. The motivations presented in the commit message for i915_user_extension (Vulkan) are similar to ours, and having it in drm may make it simpler for drivers to adopt too.

Melissa

...

-Daniel

...
+};

/**

struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D

@@ -135,12 +148,17 @@ struct drm_v3d_submit_cl { /* Number of BO handles passed in (size is that times 4). */ __u32 bo_handle_count;

/* DRM_V3D_SUBMIT_* properties */ __u32 flags;

/* ID of the perfmon to attach to this job. 0 means no perfmon. */ __u32 perfmon_id;

__u32 pad;

/* Number of extensions*/

__u32 extension_count;

/* Pointer to a list of ioctl extensions*/

__u64 extensions;

};

/** @@ -248,6 +266,15 @@ struct drm_v3d_submit_tfu { __u32 in_sync; /* Sync object to signal when the TFU job is done. */ __u32 out_sync;

/* Number of extensions*/

__u32 extension_count;

/* Pointer to an array of ioctl extensions*/

__u64 extensions;

/* DRM_V3D_SUBMIT_* properties */

__u32 flags;

};

/* Submits a compute shader for dispatch. This job will block on any @@ -276,6 +303,15 @@ struct drm_v3d_submit_csd {

/* ID of the perfmon to attach to this job. 0 means no perfmon. */ __u32 perfmon_id;

/* DRM_V3D_SUBMIT_* properties */

__u32 flags;

/* Number of extensions*/

__u32 extension_count;

/* Pointer to a list of ioctl extensions*/

__u64 extensions;

};

enum {

2.30.2

-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch

Melissa Wen

18 Aug

5:57 p.m.

New subject: [PATCH 3/3] drm/v3d: add multiple syncobjs support

Using the generic extension support set in the previous patch, this patch enables more than one in/out binary syncobj per job submission. Arrays of syncobjs are set in a specific extension type (multisync) that also takes care of determining the stage for sync (bin/render) through a flag, where applicable.

Signed-off-by: Melissa Wen mwen@igalia.com --- drivers/gpu/drm/v3d/v3d_drv.c | 3 + drivers/gpu/drm/v3d/v3d_drv.h | 14 +++ drivers/gpu/drm/v3d/v3d_gem.c | 209 +++++++++++++++++++++++++++------- include/uapi/drm/v3d_drm.h | 38 +++++++ 4 files changed, 226 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c index 6a0516160bb2..939ca8c833f5 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.c +++ b/drivers/gpu/drm/v3d/v3d_drv.c @@ -96,6 +96,9 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data, case DRM_V3D_PARAM_SUPPORTS_PERFMON: args->value = (v3d->ver >= 40); return 0; + case DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT: + args->value = 1; + return 0; default: DRM_DEBUG("Unknown parameter %d\n", args->param); return -EINVAL; diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index b900a050d5e2..544c60404a0f 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -294,6 +294,20 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; };

+struct v3d_submit_outsync { + struct drm_syncobj *syncobj; +}; + +struct v3d_submit_ext { + u32 flags; + + u32 in_sync_count; + u64 in_syncs; + + u32 out_sync_count; + struct v3d_submit_outsync *out_syncs; +}; + /** * __wait_for - magic wait macro * diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index e254919b6c5e..e7aabe1a0e11 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -392,6 +392,9 @@ v3d_render_job_free(struct kref *ref)

void v3d_job_cleanup(struct v3d_job *job) { + if (!job) + return; + drm_sched_job_cleanup(&job->base); v3d_job_put(job); } @@ -451,10 +454,11 @@ v3d_job_add_deps(struct drm_file *file_priv, struct v3d_job *job, static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, struct v3d_job *job, void (*free)(struct kref *ref), - u32 in_sync, enum v3d_queue queue) + u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) { struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - int ret; + bool has_multinsync = (se && se->in_sync_count); + int ret, i;

job->v3d = v3d; job->free = free; @@ -463,14 +467,30 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret < 0) return ret;

- ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], - v3d_priv); + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], v3d_priv); if (ret) goto fail;

- ret = v3d_job_add_deps(file_priv, job, in_sync, 0); - if (ret) - goto fail_job; + if (has_multinsync && (se->flags == (queue == V3D_RENDER))) { + struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs); + + for (i = 0; i < se->in_sync_count; i++) { + struct drm_v3d_sem in; + + ret = copy_from_user(&in, handle++, sizeof(in)); + if (ret) { + DRM_DEBUG("Failed to copy wait dep handle.\n"); + goto fail_job; + } + ret = v3d_job_add_deps(file_priv, job, in.handle, 0); + if (ret) + goto fail_job; + } + } else if (!has_multinsync) { + ret = v3d_job_add_deps(file_priv, job, in_sync, 0); + if (ret) + goto fail_job; + }

kref_init(&job->refcount);

@@ -500,6 +520,7 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, struct v3d_job *job, struct ww_acquire_ctx *acquire_ctx, u32 out_sync, + struct v3d_submit_ext *se, struct dma_fence *done_fence) { struct drm_syncobj *sync_out; @@ -514,6 +535,18 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);

/* Update the return sync object for the job */ + /* If multiples semaphores is supported */ + if (se && se->out_sync_count) { + for (i = 0; i < se->out_sync_count; i++) { + drm_syncobj_replace_fence(se->out_syncs[i].syncobj, + done_fence); + drm_syncobj_put(se->out_syncs[i].syncobj); + } + kvfree(se->out_syncs); + return; + } + + /* Single signal semaphore */ sync_out = drm_syncobj_find(file_priv, out_sync); if (sync_out) { drm_syncobj_replace_fence(sync_out, done_fence); @@ -521,11 +554,93 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, } }

+static void +v3d_put_multisync_post_deps(struct v3d_submit_ext *se) +{ + unsigned int i; + + for (i = 0; i < se->out_sync_count; i++) + drm_syncobj_put(se->out_syncs[i].syncobj); + kvfree(se->out_syncs); +} + +static int +v3d_get_multisync_post_deps(struct drm_file *file_priv, + struct v3d_submit_ext *se, + u32 count, u64 handles) +{ + struct drm_v3d_sem __user *post_deps; + int i, ret; + + se->out_syncs = (struct v3d_submit_outsync *) + kvmalloc_array(count, + sizeof(struct v3d_submit_outsync), + GFP_KERNEL); + if (!se->out_syncs) + return -ENOMEM; + + post_deps = u64_to_user_ptr(handles); + + for (i = 0; i < count; i++) { + struct drm_v3d_sem out; + + ret = copy_from_user(&out, post_deps++, sizeof(out)); + if (ret) { + DRM_DEBUG("Failed to copy post dep handles\n"); + goto fail; + } + + se->out_syncs[i].syncobj = drm_syncobj_find(file_priv, + out.handle); + if (!se->out_syncs[i].syncobj) { + ret = -EINVAL; + goto fail; + } + } + se->out_sync_count = count; + + return 0; + +fail: + for (i--; i >= 0; i--) + drm_syncobj_put(se->out_syncs[i].syncobj); + kvfree(se->out_syncs); + + return ret; +} + +static int +v3d_get_multisync_submit_deps(struct drm_file *file_priv, + struct v3d_submit_ext *se, + u64 ext_data) +{ + struct drm_v3d_multi_sync multisync = {0}; + int ret; + + ret = copy_from_user(&multisync, u64_to_user_ptr(ext_data), + sizeof(multisync)); + if (ret) + return ret; + + ret = v3d_get_multisync_post_deps(file_priv, se, multisync.out_sync_count, + multisync.out_syncs); + if (ret) + return ret; + + se->in_sync_count = multisync.in_sync_count; + se->in_syncs = multisync.in_syncs; + + se->flags = multisync.flags; + + return 0; +} + static int v3d_get_extensions(struct drm_file *file_priv, + struct v3d_submit_ext *se, u32 ext_count, u64 ext_handles) { - int i; + int i, ret; struct drm_v3d_extension __user *handles;

if (!ext_count) @@ -541,7 +656,12 @@ v3d_get_extensions(struct drm_file *file_priv, }

switch (ext.id) { - case 0: + case DRM_V3D_EXT_ID_MULTI_SYNC: + ret = v3d_get_multisync_submit_deps(file_priv, se, + ext.ext_data); + if (ret) + return ret; + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -572,6 +692,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_cl *args = data; + struct v3d_submit_ext se = {0}; struct v3d_bin_job *bin = NULL; struct v3d_render_job *render; struct v3d_job *clean_job = NULL; @@ -589,7 +710,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, + ret = v3d_get_extensions(file_priv, &se, args->extension_count, args->extensions); if (ret) { @@ -599,33 +720,35 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, }

render = kcalloc(1, sizeof(*render), GFP_KERNEL); - if (!render) + if (!render) { + v3d_put_multisync_post_deps(&se); return -ENOMEM; + }

render->start = args->rcl_start; render->end = args->rcl_end; INIT_LIST_HEAD(&render->unref_list);

- ret = v3d_job_init(v3d, file_priv, &render->base, - v3d_render_job_free, args->in_sync_rcl, V3D_RENDER); + ret = v3d_job_init(v3d, file_priv, &render->base, v3d_render_job_free, + args->in_sync_rcl, &se, V3D_RENDER); if (ret) { kfree(render); + v3d_put_multisync_post_deps(&se); return ret; }

if (args->bcl_start != args->bcl_end) { bin = kcalloc(1, sizeof(*bin), GFP_KERNEL); if (!bin) { - v3d_job_put(&render->base); - return -ENOMEM; + ret = -ENOMEM; + goto fail; }

ret = v3d_job_init(v3d, file_priv, &bin->base, - v3d_job_free, args->in_sync_bcl, V3D_BIN); + v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); if (ret) { - v3d_job_put(&render->base); kfree(bin); - return ret; + goto fail; }

bin->start = args->bcl_start; @@ -643,7 +766,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, goto fail; }

- ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, V3D_CACHE_CLEAN); + ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, 0, V3D_CACHE_CLEAN); if (ret) { kfree(clean_job); clean_job = NULL; @@ -706,6 +829,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, last_job, &acquire_ctx, args->out_sync, + &se, last_job->done_fence);

if (bin) @@ -721,11 +845,10 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, drm_gem_unlock_reservations(last_job->bo, last_job->bo_count, &acquire_ctx); fail: - if (bin) - v3d_job_cleanup(&bin->base); + v3d_job_cleanup(&bin->base); v3d_job_cleanup(&render->base); - if (clean_job) - v3d_job_cleanup(clean_job); + v3d_job_cleanup(clean_job); + v3d_put_multisync_post_deps(&se);

return ret; } @@ -745,6 +868,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, { struct v3d_dev *v3d = to_v3d_dev(dev); struct drm_v3d_submit_tfu *args = data; + struct v3d_submit_ext se = {0}; struct v3d_tfu_job *job; struct ww_acquire_ctx acquire_ctx; int ret = 0; @@ -757,7 +881,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, + ret = v3d_get_extensions(file_priv, &se, args->extension_count, args->extensions); if (ret) { @@ -767,21 +891,24 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, }

job = kcalloc(1, sizeof(*job), GFP_KERNEL); - if (!job) + if (!job) { + v3d_put_multisync_post_deps(&se); return -ENOMEM; + }

ret = v3d_job_init(v3d, file_priv, &job->base, - v3d_job_free, args->in_sync, V3D_TFU); + v3d_job_free, args->in_sync, &se, V3D_TFU); if (ret) { kfree(job); + v3d_put_multisync_post_deps(&se); return ret; }

job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), sizeof(*job->base.bo), GFP_KERNEL); if (!job->base.bo) { - v3d_job_put(&job->base); - return -ENOMEM; + ret = -ENOMEM; + goto fail; }

job->args = *args; @@ -821,6 +948,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, v3d_attach_fences_and_unlock_reservation(file_priv, &job->base, &acquire_ctx, args->out_sync, + &se, job->base.done_fence);

v3d_job_put(&job->base); @@ -829,6 +957,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,

fail: v3d_job_cleanup(&job->base); + v3d_put_multisync_post_deps(&se);

return ret; } @@ -849,8 +978,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_csd *args = data; + struct v3d_submit_ext se = {0}; struct v3d_csd_job *job; - struct v3d_job *clean_job; + struct v3d_job *clean_job = NULL; struct ww_acquire_ctx acquire_ctx; int ret;

@@ -867,7 +997,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, + ret = v3d_get_extensions(file_priv, &se, args->extension_count, args->extensions); if (ret) { @@ -877,28 +1007,29 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, }

job = kcalloc(1, sizeof(*job), GFP_KERNEL); - if (!job) + if (!job) { + v3d_put_multisync_post_deps(&se); return -ENOMEM; + }

ret = v3d_job_init(v3d, file_priv, &job->base, - v3d_job_free, args->in_sync, V3D_CSD); + v3d_job_free, args->in_sync, &se, V3D_CSD); if (ret) { kfree(job); + v3d_put_multisync_post_deps(&se); return ret; }

clean_job = kcalloc(1, sizeof(*clean_job), GFP_KERNEL); if (!clean_job) { - v3d_job_put(&job->base); - kfree(job); - return -ENOMEM; + ret = -ENOMEM; + goto fail; }

- ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, V3D_CACHE_CLEAN); + ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, 0, V3D_CACHE_CLEAN); if (ret) { - v3d_job_put(&job->base); kfree(clean_job); - return ret; + goto fail; }

job->args = *args; @@ -936,6 +1067,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, clean_job, &acquire_ctx, args->out_sync, + &se, clean_job->done_fence);

v3d_job_put(&job->base); @@ -950,6 +1082,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, fail: v3d_job_cleanup(&job->base); v3d_job_cleanup(clean_job); + v3d_put_multisync_post_deps(&se);

return ret; } diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 1f4706010eb5..bbb904c521b4 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -60,6 +60,42 @@ extern "C" { #define DRM_V3D_SUBMIT_CL_FLUSH_CACHE 0x01 #define DRM_V3D_SUBMIT_EXTENSION 0x02

+/* struct drm_v3d_sem - wait/signal semaphore + * + * If binary semaphore, it only takes syncobj handle and ignores flags and + * point fields. Point is defined for timeline syncobj feature. + */ +struct drm_v3d_sem { + __u32 handle; /* syncobj */ + /* rsv below, for future uses */ + __u32 flags; + __u64 point; /* for timeline sem support */ + __u64 mbz[2]; /* must be zero, rsv */ +}; + +/** + * struct drm_v3d_multi_sync - ioctl extension to add support multiples + * syncobjs for commands submission. + * + * When an extension of DRM_V3D_EXT_ID_MULTI_SYNC id is defined, it points to + * this extension to define wait and signal dependencies, instead of single + * in/out sync entries on submitting commands. The field flags is used to + * determine the stage to set wait dependencies. + */ +struct drm_v3d_multi_sync { + /* Array of wait and signal semaphores */ + __u64 in_syncs; + __u64 out_syncs; + + /* Number of entries */ + __u32 in_sync_count; + __u32 out_sync_count; + + /* in_sync on render stage */ + __u32 flags; +#define DRM_V3D_IN_SYNC_RCL 0x01 +}; + /* struct drm_v3d_extension - ioctl extensions * * Linked-list of generic extensions where the id identify which struct is @@ -70,6 +106,7 @@ struct drm_v3d_extension { __u64 next; __u64 ext_data; __u32 id; +#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 };

/** @@ -228,6 +265,7 @@ enum drm_v3d_param { DRM_V3D_PARAM_SUPPORTS_CSD, DRM_V3D_PARAM_SUPPORTS_CACHE_FLUSH, DRM_V3D_PARAM_SUPPORTS_PERFMON, + DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT, };

struct drm_v3d_get_param {

-- 2.30.2
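For illustration, a userspace sketch of how a client might fill the structures from this version of the patch before a submit. The struct layouts and defines are local mirrors of the uapi additions shown above (a real client would include the installed v3d_drm.h instead), and fill_multisync_ext() is a hypothetical helper, not part of the series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the uapi additions as posted in this patch version. */
#define DRM_V3D_SUBMIT_EXTENSION  0x02
#define DRM_V3D_IN_SYNC_RCL       0x01
#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01

struct drm_v3d_sem {
	uint32_t handle;  /* syncobj */
	uint32_t flags;   /* reserved */
	uint64_t point;   /* reserved for timeline semaphores */
	uint64_t mbz[2];  /* must be zero */
};

struct drm_v3d_multi_sync {
	uint64_t in_syncs;
	uint64_t out_syncs;
	uint32_t in_sync_count;
	uint32_t out_sync_count;
	uint32_t flags;
};

struct drm_v3d_extension {
	uint64_t next;
	uint64_t ext_data;
	uint32_t id;
};

/*
 * Hypothetical helper: describe 'n_waits' wait semaphores and 'n_signals'
 * signal semaphores in 'ms', and point 'ext' at it. 'wait_on_render'
 * selects DRM_V3D_IN_SYNC_RCL, i.e. waits attach to the render job.
 */
static void fill_multisync_ext(struct drm_v3d_extension *ext,
			       struct drm_v3d_multi_sync *ms,
			       const struct drm_v3d_sem *waits, uint32_t n_waits,
			       const struct drm_v3d_sem *signals, uint32_t n_signals,
			       int wait_on_render)
{
	assert(ext != NULL && ms != NULL);

	memset(ms, 0, sizeof(*ms));
	ms->in_syncs = (uint64_t)(uintptr_t)waits;
	ms->out_syncs = (uint64_t)(uintptr_t)signals;
	ms->in_sync_count = n_waits;
	ms->out_sync_count = n_signals;
	ms->flags = wait_on_render ? DRM_V3D_IN_SYNC_RCL : 0;

	memset(ext, 0, sizeof(*ext));
	ext->next = 0; /* single extension, end of list */
	ext->ext_data = (uint64_t)(uintptr_t)ms;
	ext->id = DRM_V3D_EXT_ID_MULTI_SYNC;
}
```

The submit struct would then carry flags with DRM_V3D_SUBMIT_EXTENSION set, extension_count = 1, and extensions pointing at the filled drm_v3d_extension, following the fields added in patch 2/3.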

Iago Toral

16 Sep

11:30 a.m.

New subject: [PATCH 3/3] drm/v3d: add multiple syncobjs support

I think this looks good overall. I have a few questions/comments below:

On Wed, 2021-08-18 at 18:57 +0100, Melissa Wen wrote:

...

Using the generic extension support set in the previous patch, this patch enables more than one in/out binary syncobj per job submission. Arrays of syncobjs are set in a specific extension type (multisync) that also takes care of determining the stage for sync (bin/render) through a flag, where applicable.

Signed-off-by: Melissa Wen mwen@igalia.com

drivers/gpu/drm/v3d/v3d_drv.c | 3 + drivers/gpu/drm/v3d/v3d_drv.h | 14 +++ drivers/gpu/drm/v3d/v3d_gem.c | 209 +++++++++++++++++++++++++++------- include/uapi/drm/v3d_drm.h | 38 +++++++ 4 files changed, 226 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c index 6a0516160bb2..939ca8c833f5 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.c +++ b/drivers/gpu/drm/v3d/v3d_drv.c @@ -96,6 +96,9 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data, case DRM_V3D_PARAM_SUPPORTS_PERFMON: args->value = (v3d->ver >= 40); return 0;
case DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT:
args->value = 1;
return 0;
default: DRM_DEBUG("Unknown parameter %d\n", args->param); return -EINVAL;
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index b900a050d5e2..544c60404a0f 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -294,6 +294,20 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; };

+struct v3d_submit_outsync {

struct drm_syncobj *syncobj;

+};

+struct v3d_submit_ext {

u32 flags;

u32 in_sync_count;

u64 in_syncs;

u32 out_sync_count;

struct v3d_submit_outsync *out_syncs;

+};

/**

__wait_for - magic wait macro

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index e254919b6c5e..e7aabe1a0e11 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -392,6 +392,9 @@ v3d_render_job_free(struct kref *ref)

void v3d_job_cleanup(struct v3d_job *job) {
if (!job)
return;
drm_sched_job_cleanup(&job->base); v3d_job_put(job);
} @@ -451,10 +454,11 @@ v3d_job_add_deps(struct drm_file *file_priv, struct v3d_job *job, static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, struct v3d_job *job, void (*free)(struct kref *ref),
    u32 in_sync, enum v3d_queue queue)
    u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue
queue) { struct v3d_file_priv *v3d_priv = file_priv->driver_priv;

int ret;

bool has_multinsync = (se && se->in_sync_count);

int ret, i;

job->v3d = v3d; job->free = free;

@@ -463,14 +467,30 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret < 0) return ret;

ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
		 v3d_priv);
ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], v3d_priv);

if (ret) goto fail;
ret = v3d_job_add_deps(file_priv, job, in_sync, 0);

if (ret)
goto fail_job;
if (has_multinsync && (se->flags == (queue == V3D_RENDER))) {

I think this is unnecessarily difficult to understand; I'd suggest coding the condition more explicitly:

if (has_multinsync && (se->flags & DRM_V3D_IN_SYNC_RCL) && queue == V3D_RENDER)

unless I am missing the point here :)
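To make the comparison concrete, here is a small userspace sketch of one reading of the posted condition next to a more explicit spelling. The enum and flag are local stand-ins that mirror the driver's queue enum and DRM_V3D_IN_SYNC_RCL; the helper names are hypothetical. Note that the posted expression is also true for a non-render queue when flags is 0, which appears to be how the waits reach the bin job (and TFU/CSD jobs), so an explicit version probably needs to keep that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins mirroring the driver's queue enum and the flag from the patch. */
enum queue { Q_BIN, Q_RENDER, Q_TFU, Q_CSD, Q_CACHE_CLEAN };
#define IN_SYNC_RCL 0x01

/* The condition as posted: relies on flags being exactly 0 or 1. */
static bool waits_apply_posted(uint32_t flags, enum queue q)
{
	return flags == (uint32_t)(q == Q_RENDER);
}

/*
 * Explicit spelling: the waits belong to the render job when the RCL flag
 * is set, and to the non-render job otherwise.
 */
static bool waits_apply_explicit(uint32_t flags, enum queue q)
{
	bool want_render = (flags & IN_SYNC_RCL) != 0;
	bool is_render = (q == Q_RENDER);

	return want_render == is_render;
}

/* The two spellings agree whenever flags is 0 or IN_SYNC_RCL. */
static void check_equivalence(void)
{
	for (uint32_t flags = 0; flags <= IN_SYNC_RCL; flags++)
		for (int q = Q_BIN; q <= Q_CACHE_CLEAN; q++)
			assert(waits_apply_posted(flags, (enum queue)q) ==
			       waits_apply_explicit(flags, (enum queue)q));
}
```

The two differ only once other flag bits are defined: the equality test then stops matching any queue, while the masked test keeps working.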

...

struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs);
for (i = 0; i < se->in_sync_count; i++) {
	struct drm_v3d_sem in;
	ret = copy_from_user(&in, handle++, sizeof(in));
	if (ret) {
		DRM_DEBUG("Failed to copy wait dep handle.\n");
		goto fail_job;
	}
	ret = v3d_job_add_deps(file_priv, job, in.handle, 0);
	if (ret)
		goto fail_job;
}
} else if (!has_multinsync) {
ret = v3d_job_add_deps(file_priv, job, in_sync, 0);
if (ret)
	goto fail_job;
}

kref_init(&job->refcount);
@@ -500,6 +520,7 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, struct v3d_job *job, struct ww_acquire_ctx *acquire_ctx, u32 out_sync,
			 struct v3d_submit_ext *se,
		 struct dma_fence *done_fence)
{ struct drm_syncobj *sync_out; @@ -514,6 +535,18 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);

/* Update the return sync object for the job */
/* If multiples semaphores is supported */

if (se && se->out_sync_count) {
for (i = 0; i < se->out_sync_count; i++) {
	drm_syncobj_replace_fence(se->out_syncs[i].syncobj,
				  done_fence);
	drm_syncobj_put(se->out_syncs[i].syncobj);
}
kvfree(se->out_syncs);
return;
}

/* Single signal semaphore */ sync_out = drm_syncobj_find(file_priv, out_sync); if (sync_out) { drm_syncobj_replace_fence(sync_out, done_fence);
@@ -521,11 +554,93 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, } }

+static void +v3d_put_multisync_post_deps(struct v3d_submit_ext *se) +{
unsigned int i;

for (i = 0; i < se->out_sync_count; i++)
drm_syncobj_put(se->out_syncs[i].syncobj);
kvfree(se->out_syncs);
+}

+static int +v3d_get_multisync_post_deps(struct drm_file *file_priv,
	    struct v3d_submit_ext *se,
	    u32 count, u64 handles)
+{
struct drm_v3d_sem __user *post_deps;

int i, ret;

se->out_syncs = (struct v3d_submit_outsync *)
	kvmalloc_array(count,
		       sizeof(struct v3d_submit_outsync),
		       GFP_KERNEL);
if (!se->out_syncs)
return -ENOMEM;
post_deps = u64_to_user_ptr(handles);

for (i = 0; i < count; i++) {
struct drm_v3d_sem out;
ret = copy_from_user(&out, post_deps++, sizeof(out));
if (ret) {
	DRM_DEBUG("Failed to copy post dep handles\n");
	goto fail;
}
se->out_syncs[i].syncobj = drm_syncobj_find(file_priv,
					    out.handle);
if (!se->out_syncs[i].syncobj) {
	ret = -EINVAL;
	goto fail;
}
}

se->out_sync_count = count;

return 0;
+fail:
for (i--; i >= 0; i--)
drm_syncobj_put(se->out_syncs[i].syncobj);
kvfree(se->out_syncs);

return ret;
+}
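The acquire-then-unwind pattern in the function above can be sketched in userspace as follows. Everything here is a mock: mock_find(), mock_put() and mock_copy() are hypothetical stand-ins for drm_syncobj_find(), drm_syncobj_put() and copy_from_user(). The sketch also maps a failed copy to -EFAULT, since copy_from_user() returns the number of bytes it could not copy rather than an errno; whether the series should do the same is for the author and reviewers to decide.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a refcounted syncobj; all names here are hypothetical. */
struct mock_syncobj { int refs; };

#define MOCK_TABLE_SIZE 4
static struct mock_syncobj mock_table[MOCK_TABLE_SIZE];

/* Stand-in for drm_syncobj_find(): handles 1..MOCK_TABLE_SIZE are valid. */
static struct mock_syncobj *mock_find(uint32_t handle)
{
	if (handle == 0 || handle > MOCK_TABLE_SIZE)
		return NULL;
	mock_table[handle - 1].refs++;
	return &mock_table[handle - 1];
}

/* Stand-in for drm_syncobj_put(). */
static void mock_put(struct mock_syncobj *obj)
{
	obj->refs--;
}

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static unsigned long mock_copy(void *dst, const void *src, unsigned long n)
{
	if (!src)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* Take a reference on every handle, or release all of them on failure. */
static int get_post_deps(const uint32_t *handles, uint32_t count,
			 struct mock_syncobj ***out)
{
	struct mock_syncobj **objs;
	uint32_t i;
	int ret;

	assert(out != NULL);
	*out = NULL;
	if (!count)
		return 0;

	objs = calloc(count, sizeof(*objs));
	if (!objs)
		return -ENOMEM;

	for (i = 0; i < count; i++) {
		uint32_t handle;

		if (mock_copy(&handle, handles ? &handles[i] : NULL,
			      sizeof(handle))) {
			ret = -EFAULT; /* not the raw byte count */
			goto fail;
		}
		objs[i] = mock_find(handle);
		if (!objs[i]) {
			ret = -EINVAL;
			goto fail;
		}
	}
	*out = objs;
	return 0;

fail:
	/* Drop only the references taken before the failing index. */
	while (i--)
		mock_put(objs[i]);
	free(objs);
	return ret;
}
```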

+static int +v3d_get_multisync_submit_deps(struct drm_file *file_priv,
	      struct v3d_submit_ext *se,
	      u64 ext_data)
+{
struct drm_v3d_multi_sync multisync = {0};

int ret;

ret = copy_from_user(&multisync, u64_to_user_ptr(ext_data),
	     sizeof(multisync));
if (ret)
return ret;
ret = v3d_get_multisync_post_deps(file_priv, se,
multisync.out_sync_count,
			  multisync.out_syncs);
if (ret)
return ret;
se->in_sync_count = multisync.in_sync_count;

se->in_syncs = multisync.in_syncs;

se->flags = multisync.flags;

return 0;
+}

static int v3d_get_extensions(struct drm_file *file_priv,
   struct v3d_submit_ext *se,
u32 ext_count, u64 ext_handles)
{

int i;

int i, ret; struct drm_v3d_extension __user *handles;

if (!ext_count)

@@ -541,7 +656,12 @@ v3d_get_extensions(struct drm_file *file_priv, }
switch (ext.id) {
case 0:
case DRM_V3D_EXT_ID_MULTI_SYNC:
	ret = v3d_get_multisync_submit_deps(file_priv, se,
					    ext.ext_data);
	if (ret)
		return ret;
	break;
default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n",
ext.id); return -EINVAL; @@ -572,6 +692,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_cl *args = data;

struct v3d_submit_ext se = {0}; struct v3d_bin_job *bin = NULL; struct v3d_render_job *render; struct v3d_job *clean_job = NULL;

@@ -589,7 +710,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
ret = v3d_get_extensions(file_priv, &se,
		 args->extension_count,
		 args->extensions);
if (ret) {
@@ -599,33 +720,35 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, }

render = kcalloc(1, sizeof(*render), GFP_KERNEL);

if (!render)
if (!render) {
v3d_put_multisync_post_deps(&se);
return -ENOMEM;
}

render->start = args->rcl_start; render->end = args->rcl_end; INIT_LIST_HEAD(&render->unref_list);
ret = v3d_job_init(v3d, file_priv, &render->base,
	   v3d_render_job_free, args->in_sync_rcl,
V3D_RENDER);

ret = v3d_job_init(v3d, file_priv, &render->base,

v3d_render_job_free,
	   args->in_sync_rcl, &se, V3D_RENDER);
if (ret) { kfree(render);
v3d_put_multisync_post_deps(&se);
return ret; }

if (args->bcl_start != args->bcl_end) { bin = kcalloc(1, sizeof(*bin), GFP_KERNEL); if (!bin) {
	v3d_job_put(&render->base);
	return -ENOMEM;
	ret = -ENOMEM;
	goto fail;

We are now calling v3d_job_cleanup where we would call v3d_job_put before for error codepaths. Is it safe to call this if we fail job creation before calling drm_sched_job_init like here?

The documentation says that this cleans up the resources allocated with drm_sched_job_init, which we haven't called yet.

...

}

ret = v3d_job_init(v3d, file_priv, &bin->base,
		   v3d_job_free, args->in_sync_bcl,
V3D_BIN);
		   v3d_job_free, args->in_sync_bcl,
&se, V3D_BIN); if (ret) {
	v3d_job_put(&render->base);
kfree(bin);
	return ret;
	goto fail;
}

bin->start = args->bcl_start;
@@ -643,7 +766,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, goto fail; }
ret = v3d_job_init(v3d, file_priv, clean_job,
v3d_job_free, 0, V3D_CACHE_CLEAN);
ret = v3d_job_init(v3d, file_priv, clean_job,
v3d_job_free, 0, 0, V3D_CACHE_CLEAN); if (ret) { kfree(clean_job); clean_job = NULL; @@ -706,6 +829,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, last_job, &acquire_ctx, args->out_sync,
				 &se,
			 last_job->done_fence);
if (bin)
@@ -721,11 +845,10 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, drm_gem_unlock_reservations(last_job->bo, last_job->bo_count, &acquire_ctx); fail:
if (bin)
v3d_job_cleanup(&bin->base);
v3d_job_cleanup(&bin->base); v3d_job_cleanup(&render->base);
if (clean_job)
v3d_job_cleanup(clean_job);
v3d_job_cleanup(clean_job);

v3d_put_multisync_post_deps(&se);

return ret;

} @@ -745,6 +868,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, { struct v3d_dev *v3d = to_v3d_dev(dev); struct drm_v3d_submit_tfu *args = data;

struct v3d_submit_ext se = {0}; struct v3d_tfu_job *job; struct ww_acquire_ctx acquire_ctx; int ret = 0;

@@ -757,7 +881,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
ret = v3d_get_extensions(file_priv, &se,
		 args->extension_count,
		 args->extensions);
if (ret) {
@@ -767,21 +891,24 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, }

job = kcalloc(1, sizeof(*job), GFP_KERNEL);

if (!job)
if (!job) {
v3d_put_multisync_post_deps(&se);
return -ENOMEM;
}

ret = v3d_job_init(v3d, file_priv, &job->base,
	   v3d_job_free, args->in_sync, V3D_TFU);
	   v3d_job_free, args->in_sync, &se, V3D_TFU);
if (ret) { kfree(job);
v3d_put_multisync_post_deps(&se);
return ret; }

job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), sizeof(*job->base.bo), GFP_KERNEL); if (!job->base.bo) {
v3d_job_put(&job->base);
return -ENOMEM;
ret = -ENOMEM;
goto fail;
}

job->args = *args;
@@ -821,6 +948,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, v3d_attach_fences_and_unlock_reservation(file_priv, &job->base, &acquire_ctx, args->out_sync,
				 &se,
			 job->base.done_fence);
v3d_job_put(&job->base);
@@ -829,6 +957,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,

fail: v3d_job_cleanup(&job->base);

v3d_put_multisync_post_deps(&se);

return ret;

} @@ -849,8 +978,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_csd *args = data;

struct v3d_submit_ext se = {0}; struct v3d_csd_job *job;

struct v3d_job *clean_job;

struct v3d_job *clean_job = NULL; struct ww_acquire_ctx acquire_ctx; int ret;

@@ -867,7 +997,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
ret = v3d_get_extensions(file_priv, &se,
		 args->extension_count,
		 args->extensions);
if (ret) {
@@ -877,28 +1007,29 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, }

job = kcalloc(1, sizeof(*job), GFP_KERNEL);

if (!job)
if (!job) {
v3d_put_multisync_post_deps(&se);
return -ENOMEM;
}

ret = v3d_job_init(v3d, file_priv, &job->base,
	   v3d_job_free, args->in_sync, V3D_CSD);
	   v3d_job_free, args->in_sync, &se, V3D_CSD);
if (ret) { kfree(job);
v3d_put_multisync_post_deps(&se);
return ret; }

clean_job = kcalloc(1, sizeof(*clean_job), GFP_KERNEL); if (!clean_job) {
v3d_job_put(&job->base);
kfree(job);
return -ENOMEM;
ret = -ENOMEM;
goto fail;
}
ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0,

V3D_CACHE_CLEAN);

ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0,

0, V3D_CACHE_CLEAN); if (ret) {
v3d_job_put(&job->base);
kfree(clean_job);
return ret;
goto fail;
}

job->args = *args;
@@ -936,6 +1067,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, clean_job, &acquire_ctx, args->out_sync,
				 &se,
			 clean_job->done_fence);

v3d_job_put(&job->base); @@ -950,6 +1082,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, fail: v3d_job_cleanup(&job->base); v3d_job_cleanup(clean_job);

v3d_put_multisync_post_deps(&se);

return ret;

} diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 1f4706010eb5..bbb904c521b4 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -60,6 +60,42 @@ extern "C" { #define DRM_V3D_SUBMIT_CL_FLUSH_CACHE 0x01 #define DRM_V3D_SUBMIT_EXTENSION 0x02

+/* struct drm_v3d_sem - wait/signal semaphore

If binary semaphore, it only takes syncobj handle and ignores

flags and

point fields. Point is defined for timeline syncobj feature.

*/

+struct drm_v3d_sem {

__u32 handle; /* syncobj */

/* rsv below, for future uses */

__u32 flags;

__u64 point; /* for timeline sem support */

__u64 mbz[2]; /* must be zero, rsv */

+};

I guess the idea here would be that we would check handle and/or point for whether they have a valid value to decide what type of semaphore this is, right?

...

+/**

struct drm_v3d_multi_sync - ioctl extension to add support

multiples

syncobjs for commands submission.

When an extension of DRM_V3D_EXT_ID_MULTI_SYNC id is defined, it

points to

this extension to define wait and signal dependencies, instead of

single

in/out sync entries on submitting commands. The field flags is

used to

determine the stage to set wait dependencies.

*/

+struct drm_v3d_multi_sync {

/* Array of wait and signal semaphores */

__u64 in_syncs;

__u64 out_syncs;

/* Number of entries */

__u32 in_sync_count;

__u32 out_sync_count;

/* in_sync on render stage */

__u32 flags;

+#define DRM_V3D_IN_SYNC_RCL 0x01 +};

/* struct drm_v3d_extension - ioctl extensions

Linked-list of generic extensions where the id identify which

struct is @@ -70,6 +106,7 @@ struct drm_v3d_extension { __u64 next; __u64 ext_data; __u32 id; +#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 };

/** @@ -228,6 +265,7 @@ enum drm_v3d_param { DRM_V3D_PARAM_SUPPORTS_CSD, DRM_V3D_PARAM_SUPPORTS_CACHE_FLUSH, DRM_V3D_PARAM_SUPPORTS_PERFMON,

DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT,

};

struct drm_v3d_get_param {

Melissa Wen

4:39 p.m.

New subject: [PATCH 3/3] drm/v3d: add multiple syncobjs support

On 09/16, Iago Toral wrote:

...

I think this looks good overall, I have a few questions/comments below:

On Wed, 2021-08-18 at 18:57 +0100, Melissa Wen wrote:

...
Using the generic extension support set in the previous patch, this patch enables more than one in/out binary syncobj per job submission. Arrays of syncobjs are set in a specific extension type (multisync) that also takes care of determining the stage for sync (bin/render) through a flag, when this is the case.

Signed-off-by: Melissa Wen <mwen@igalia.com>

 drivers/gpu/drm/v3d/v3d_drv.c |   3 +
 drivers/gpu/drm/v3d/v3d_drv.h |  14 +++
 drivers/gpu/drm/v3d/v3d_gem.c | 209 +++++++++++++++++++++++++++-------
 include/uapi/drm/v3d_drm.h    |  38 +++++++
 4 files changed, 226 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c index 6a0516160bb2..939ca8c833f5 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.c +++ b/drivers/gpu/drm/v3d/v3d_drv.c @@ -96,6 +96,9 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data, case DRM_V3D_PARAM_SUPPORTS_PERFMON: args->value = (v3d->ver >= 40); return 0;
case DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT:
args->value = 1;
return 0;
default: DRM_DEBUG("Unknown parameter %d\n", args->param); return -EINVAL;
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index b900a050d5e2..544c60404a0f 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -294,6 +294,20 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; };

+struct v3d_submit_outsync {

struct drm_syncobj *syncobj;

+};

+struct v3d_submit_ext {

u32 flags;

u32 in_sync_count;

u64 in_syncs;

u32 out_sync_count;

struct v3d_submit_outsync *out_syncs;

+};

/**

__wait_for - magic wait macro

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index e254919b6c5e..e7aabe1a0e11 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -392,6 +392,9 @@ v3d_render_job_free(struct kref *ref)

void v3d_job_cleanup(struct v3d_job *job) {
if (!job)
return;
drm_sched_job_cleanup(&job->base); v3d_job_put(job);
} @@ -451,10 +454,11 @@ v3d_job_add_deps(struct drm_file *file_priv, struct v3d_job *job, static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, struct v3d_job *job, void (*free)(struct kref *ref),
    u32 in_sync, enum v3d_queue queue)
    u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue
queue) { struct v3d_file_priv *v3d_priv = file_priv->driver_priv;

int ret;

bool has_multinsync = (se && se->in_sync_count);

int ret, i;

job->v3d = v3d; job->free = free;

@@ -463,14 +467,30 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret < 0) return ret;

ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
		 v3d_priv);
ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], v3d_priv);

if (ret) goto fail;
ret = v3d_job_add_deps(file_priv, job, in_sync, 0);

if (ret)
goto fail_job;
if (has_multinsync && (se->flags == (queue == V3D_RENDER))) {
I think this is unnecessarily difficult to understand, I'd suggest to code the condition more explicitly:

if (has_multinsync && (se->flags & DRM_V3D_IN_SYNC_RCL) && queue == V3D_RENDER)

unless I am missing the point here :)

More or less. The condition should be true if the IN_SYNC_RCL flag is set and the queue is V3D_RENDER, and also if IN_SYNC_RCL is not set and the queue is not V3D_RENDER.

Anyway, you are right that it needs to be improved and made more understandable; I will work on it for the next version.
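One way to spell out the two cases described above is to compare two booleans rather than comparing the raw flags word against a comparison result. The following is only a sketch under stated assumptions: `multisync_waits_apply` is a hypothetical helper name, the enum is a local stand-in for the driver's queue enum, and the flag value is copied from the uAPI hunk in this patch. It is not the form used in any later revision of the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value copied from the uAPI hunk in this patch. */
#define DRM_V3D_IN_SYNC_RCL 0x01

/* Local stand-in for the driver's queue enum; only V3D_RENDER matters here. */
enum v3d_queue { V3D_BIN, V3D_RENDER, V3D_TFU, V3D_CSD, V3D_CACHE_CLEAN };

/*
 * Hypothetical helper: returns true when the multisync wait list should be
 * attached to the job being initialised on @queue. That is the case when
 * the flag asks for the render stage and this is the render queue, or when
 * the flag is clear and this is any other queue.
 */
static bool multisync_waits_apply(uint32_t flags, enum v3d_queue queue)
{
	bool wait_on_render = (flags & DRM_V3D_IN_SYNC_RCL) != 0;
	bool is_render = (queue == V3D_RENDER);

	return wait_on_render == is_render;
}
```

Masking the flag first means the result would not change if further bits were ever defined in `flags`, which a direct `se->flags == (queue == V3D_RENDER)` comparison does not guarantee.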

...

...
struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs);
for (i = 0; i < se->in_sync_count; i++) {
	struct drm_v3d_sem in;
	ret = copy_from_user(&in, handle++, sizeof(in));
	if (ret) {
		DRM_DEBUG("Failed to copy wait dep handle.\n");
		goto fail_job;
	}
	ret = v3d_job_add_deps(file_priv, job,
in.handle, 0);
	if (ret)
		goto fail_job;
}
} else if (!has_multinsync) {
ret = v3d_job_add_deps(file_priv, job, in_sync, 0);
if (ret)
	goto fail_job;
}

kref_init(&job->refcount);
@@ -500,6 +520,7 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, struct v3d_job *job, struct ww_acquire_ctx *acquire_ctx, u32 out_sync,
			 struct v3d_submit_ext *se,
		 struct dma_fence *done_fence)
{ struct drm_syncobj *sync_out; @@ -514,6 +535,18 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);

/* Update the return sync object for the job */
/* If multiples semaphores is supported */

if (se && se->out_sync_count) {
for (i = 0; i < se->out_sync_count; i++) {
	drm_syncobj_replace_fence(se->out_syncs[i].syncobj,
				  done_fence);
	drm_syncobj_put(se->out_syncs[i].syncobj);
}
kvfree(se->out_syncs);
return;
}

/* Single signal semaphore */ sync_out = drm_syncobj_find(file_priv, out_sync); if (sync_out) { drm_syncobj_replace_fence(sync_out, done_fence);
@@ -521,11 +554,93 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, } }

+static void +v3d_put_multisync_post_deps(struct v3d_submit_ext *se) +{
unsigned int i;

for (i = 0; i < se->out_sync_count; i++)
drm_syncobj_put(se->out_syncs[i].syncobj);
kvfree(se->out_syncs);
+}

+static int +v3d_get_multisync_post_deps(struct drm_file *file_priv,
	    struct v3d_submit_ext *se,
	    u32 count, u64 handles)
+{
struct drm_v3d_sem __user *post_deps;

int i, ret;

se->out_syncs = (struct v3d_submit_outsync *)
	kvmalloc_array(count,
		       sizeof(struct v3d_submit_outsync),
		       GFP_KERNEL);
if (!se->out_syncs)
return -ENOMEM;
post_deps = u64_to_user_ptr(handles);

for (i = 0; i < count; i++) {
struct drm_v3d_sem out;
ret = copy_from_user(&out, post_deps++, sizeof(out));
if (ret) {
	DRM_DEBUG("Failed to copy post dep handles\n");
	goto fail;
}
se->out_syncs[i].syncobj = drm_syncobj_find(file_priv,
					    out.handle);
if (!se->out_syncs[i].syncobj) {
	ret = -EINVAL;
	goto fail;
}
}

se->out_sync_count = count;

return 0;
+fail:
for (i--; i >= 0; i--)
drm_syncobj_put(se->out_syncs[i].syncobj);
kvfree(se->out_syncs);

return ret;
+}

+static int +v3d_get_multisync_submit_deps(struct drm_file *file_priv,
	      struct v3d_submit_ext *se,
	      u64 ext_data)
+{
struct drm_v3d_multi_sync multisync = {0};

int ret;

ret = copy_from_user(&multisync, u64_to_user_ptr(ext_data),
	     sizeof(multisync));
if (ret)
return ret;
ret = v3d_get_multisync_post_deps(file_priv, se,
multisync.out_sync_count,
			  multisync.out_syncs);
if (ret)
return ret;
se->in_sync_count = multisync.in_sync_count;

se->in_syncs = multisync.in_syncs;

se->flags = multisync.flags;

return 0;
+}

static int v3d_get_extensions(struct drm_file *file_priv,
   struct v3d_submit_ext *se,
u32 ext_count, u64 ext_handles)
{

int i;

int i, ret; struct drm_v3d_extension __user *handles;

if (!ext_count)

@@ -541,7 +656,12 @@ v3d_get_extensions(struct drm_file *file_priv, }
switch (ext.id) {
case 0:
case DRM_V3D_EXT_ID_MULTI_SYNC:
	ret = v3d_get_multisync_submit_deps(file_priv,
se,
					    ext.ext_data);
	if (ret)
		return ret;
	break;
default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n",
ext.id); return -EINVAL; @@ -572,6 +692,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_cl *args = data;

struct v3d_submit_ext se = {0}; struct v3d_bin_job *bin = NULL; struct v3d_render_job *render; struct v3d_job *clean_job = NULL;

@@ -589,7 +710,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
ret = v3d_get_extensions(file_priv, &se,
		 args->extension_count,
		 args->extensions);
if (ret) {
@@ -599,33 +720,35 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, }

render = kcalloc(1, sizeof(*render), GFP_KERNEL);

if (!render)
if (!render) {
v3d_put_multisync_post_deps(&se);
return -ENOMEM;
}

render->start = args->rcl_start; render->end = args->rcl_end; INIT_LIST_HEAD(&render->unref_list);
ret = v3d_job_init(v3d, file_priv, &render->base,
	   v3d_render_job_free, args->in_sync_rcl,
V3D_RENDER);

ret = v3d_job_init(v3d, file_priv, &render->base,

v3d_render_job_free,
	   args->in_sync_rcl, &se, V3D_RENDER);
if (ret) { kfree(render);
v3d_put_multisync_post_deps(&se);
return ret; }

if (args->bcl_start != args->bcl_end) { bin = kcalloc(1, sizeof(*bin), GFP_KERNEL); if (!bin) {
	v3d_job_put(&render->base);
	return -ENOMEM;
	ret = -ENOMEM;
	goto fail;
We are now calling v3d_job_cleanup where we would call v3d_job_put before for error codepaths. Is it safe to call this if we fail job creation before calling drm_sched_job_init like here?

The documentation says that this cleans up the resources allocated with drm_sched_job_init, which we haven't called yet.

Thanks for pointing it out. It took me a while to find the loose end, but I just realized the problem is in the current implementation.

At this point, bin is null, but render is between drm_sched_job_init and drm_sched_job_arm; so, if the bin job allocation fails, we still need to cleanup resources already allocated to the render job, because render was already initialized (v3d_job_init -> drm_sched_job_init). I'll send a fix for it.

For the bin case, I've added a check in v3d_job_cleanup() to return if the @job is null (no cleanup or put for this job). But I should also check if bin is null in the 'fail:' section (as it was previously) to avoid unexpected behavior.
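The error path discussed above can be modelled in userspace. This is only an illustrative mock: `struct mock_job`, `mock_job_cleanup` and `mock_submit` are hypothetical names, and plain `calloc`/`free` stand in for the driver's job setup and scheduler cleanup. It shows a render job that is already initialised being torn down when the bin allocation fails, while the NULL bin is skipped.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct v3d_job; not the driver's definition. */
struct mock_job {
	int sched_initialized;
};

/* Counts how many jobs were actually cleaned up. */
static int cleanup_calls;

/* Mirrors the NULL check the patch adds to v3d_job_cleanup(). */
static void mock_job_cleanup(struct mock_job *job)
{
	if (!job)
		return;
	cleanup_calls++;
	free(job);
}

/*
 * Models the submit_cl flow: render is allocated and initialised first,
 * then the bin allocation may fail (forced by @fail_bin_alloc). Both the
 * failure and the normal path end in the same teardown in this mock.
 */
static int mock_submit(int fail_bin_alloc)
{
	struct mock_job *render, *bin = NULL;
	int ret = 0;

	render = calloc(1, sizeof(*render));
	if (!render)
		return -ENOMEM;
	render->sched_initialized = 1;

	if (!fail_bin_alloc)
		bin = calloc(1, sizeof(*bin));
	if (!bin) {
		ret = -ENOMEM;
		goto fail;
	}
	bin->sched_initialized = 1;

fail:
	/* bin may be NULL here; render never is. */
	if (bin)
		mock_job_cleanup(bin);
	mock_job_cleanup(render);
	return ret;
}
```

The mock keeps the explicit `if (bin)` at the `fail:` label, as proposed in the reply above, so the teardown does not depend solely on the NULL check inside the cleanup helper.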

...

...
}

ret = v3d_job_init(v3d, file_priv, &bin->base,
		   v3d_job_free, args->in_sync_bcl,
V3D_BIN);
		   v3d_job_free, args->in_sync_bcl,
&se, V3D_BIN); if (ret) {
	v3d_job_put(&render->base);
kfree(bin);
	return ret;
	goto fail;
}

bin->start = args->bcl_start;
@@ -643,7 +766,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, goto fail; }
ret = v3d_job_init(v3d, file_priv, clean_job,
v3d_job_free, 0, V3D_CACHE_CLEAN);
ret = v3d_job_init(v3d, file_priv, clean_job,
v3d_job_free, 0, 0, V3D_CACHE_CLEAN); if (ret) { kfree(clean_job); clean_job = NULL; @@ -706,6 +829,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, last_job, &acquire_ctx, args->out_sync,
				 &se,
			 last_job->done_fence);
if (bin)
@@ -721,11 +845,10 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, drm_gem_unlock_reservations(last_job->bo, last_job->bo_count, &acquire_ctx); fail:
if (bin)
v3d_job_cleanup(&bin->base);
v3d_job_cleanup(&bin->base); v3d_job_cleanup(&render->base);
if (clean_job)
v3d_job_cleanup(clean_job);
v3d_job_cleanup(clean_job);

v3d_put_multisync_post_deps(&se);

return ret;

} @@ -745,6 +868,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, { struct v3d_dev *v3d = to_v3d_dev(dev); struct drm_v3d_submit_tfu *args = data;

struct v3d_submit_ext se = {0}; struct v3d_tfu_job *job; struct ww_acquire_ctx acquire_ctx; int ret = 0;

@@ -757,7 +881,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
ret = v3d_get_extensions(file_priv, &se,
		 args->extension_count,
		 args->extensions);
if (ret) {
@@ -767,21 +891,24 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, }

job = kcalloc(1, sizeof(*job), GFP_KERNEL);

if (!job)
if (!job) {
v3d_put_multisync_post_deps(&se);
return -ENOMEM;
}

ret = v3d_job_init(v3d, file_priv, &job->base,
	   v3d_job_free, args->in_sync, V3D_TFU);
	   v3d_job_free, args->in_sync, &se, V3D_TFU);
if (ret) { kfree(job);
v3d_put_multisync_post_deps(&se);
return ret; }

job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), sizeof(*job->base.bo), GFP_KERNEL); if (!job->base.bo) {
v3d_job_put(&job->base);
return -ENOMEM;
ret = -ENOMEM;
goto fail;
}

job->args = *args;
@@ -821,6 +948,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, v3d_attach_fences_and_unlock_reservation(file_priv, &job->base, &acquire_ctx, args->out_sync,
				 &se,
			 job->base.done_fence);
v3d_job_put(&job->base);
@@ -829,6 +957,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,

fail: v3d_job_cleanup(&job->base);

v3d_put_multisync_post_deps(&se);

return ret;

} @@ -849,8 +978,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_csd *args = data;

struct v3d_submit_ext se = {0}; struct v3d_csd_job *job;

struct v3d_job *clean_job;

struct v3d_job *clean_job = NULL; struct ww_acquire_ctx acquire_ctx; int ret;

@@ -867,7 +997,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, }

if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv,
ret = v3d_get_extensions(file_priv, &se,
		 args->extension_count,
		 args->extensions);
if (ret) {
@@ -877,28 +1007,29 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, }

job = kcalloc(1, sizeof(*job), GFP_KERNEL);

if (!job)
if (!job) {
v3d_put_multisync_post_deps(&se);
return -ENOMEM;
}

ret = v3d_job_init(v3d, file_priv, &job->base,
	   v3d_job_free, args->in_sync, V3D_CSD);
	   v3d_job_free, args->in_sync, &se, V3D_CSD);
if (ret) { kfree(job);
v3d_put_multisync_post_deps(&se);
return ret; }

clean_job = kcalloc(1, sizeof(*clean_job), GFP_KERNEL); if (!clean_job) {
v3d_job_put(&job->base);
kfree(job);
return -ENOMEM;
ret = -ENOMEM;
goto fail;
}
ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0,

V3D_CACHE_CLEAN);

ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0,

0, V3D_CACHE_CLEAN); if (ret) {
v3d_job_put(&job->base);
kfree(clean_job);
return ret;
goto fail;
}

job->args = *args;
@@ -936,6 +1067,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, clean_job, &acquire_ctx, args->out_sync,
				 &se,
			 clean_job->done_fence);

v3d_job_put(&job->base); @@ -950,6 +1082,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, fail: v3d_job_cleanup(&job->base); v3d_job_cleanup(clean_job);

v3d_put_multisync_post_deps(&se);

return ret;

} diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 1f4706010eb5..bbb904c521b4 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -60,6 +60,42 @@ extern "C" { #define DRM_V3D_SUBMIT_CL_FLUSH_CACHE 0x01 #define DRM_V3D_SUBMIT_EXTENSION 0x02

+/* struct drm_v3d_sem - wait/signal semaphore

If binary semaphore, it only takes syncobj handle and ignores

flags and

point fields. Point is defined for timeline syncobj feature.

*/

+struct drm_v3d_sem {

__u32 handle; /* syncobj */

/* rsv below, for future uses */

__u32 flags;

__u64 point; /* for timeline sem support */

__u64 mbz[2]; /* must be zero, rsv */

+};
I guess the idea here would be that we would check handle and/or point for whether they have a valid value to decide what type of semaphore this is, right?

Yes, when syncobj timeline support is enabled, point will distinguish whether it is a timeline or binary semaphore.
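As a rough illustration of that idea, the sketch below classifies an entry by its `point` field. It rests on assumptions that this thread does not confirm: that `point == 0` denotes a binary semaphore, and that the kernel would reject non-zero reserved words. The struct is a userspace copy of the layout proposed in this patch, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace copy of the layout proposed in this patch. */
struct drm_v3d_sem {
	uint32_t handle;  /* syncobj */
	uint32_t flags;   /* reserved for future use */
	uint64_t point;   /* for timeline semaphore support */
	uint64_t mbz[2];  /* must be zero, reserved */
};

/*
 * Hypothetical classifier. Assumption: a zero point means a binary
 * semaphore and any non-zero point means a timeline semaphore.
 */
static bool v3d_sem_is_timeline(const struct drm_v3d_sem *sem)
{
	return sem->point != 0;
}

/* Hypothetical validation: the reserved words must stay zero. */
static bool v3d_sem_reserved_ok(const struct drm_v3d_sem *sem)
{
	return sem->mbz[0] == 0 && sem->mbz[1] == 0;
}
```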

Thanks for reviewing, I'll address those points.

Melissa

...

...

+/**

struct drm_v3d_multi_sync - ioctl extension to add support

multiples

syncobjs for commands submission.

When an extension of DRM_V3D_EXT_ID_MULTI_SYNC id is defined, it

points to

this extension to define wait and signal dependencies, instead of

single

in/out sync entries on submitting commands. The field flags is

used to

determine the stage to set wait dependencies.

*/

+struct drm_v3d_multi_sync {

/* Array of wait and signal semaphores */

__u64 in_syncs;

__u64 out_syncs;

/* Number of entries */

__u32 in_sync_count;

__u32 out_sync_count;

/* in_sync on render stage */

__u32 flags;

+#define DRM_V3D_IN_SYNC_RCL 0x01 +};

/* struct drm_v3d_extension - ioctl extensions

Linked-list of generic extensions where the id identify which

struct is @@ -70,6 +106,7 @@ struct drm_v3d_extension { __u64 next; __u64 ext_data; __u32 id; +#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 };

/** @@ -228,6 +265,7 @@ enum drm_v3d_param { DRM_V3D_PARAM_SUPPORTS_CSD, DRM_V3D_PARAM_SUPPORTS_CACHE_FLUSH, DRM_V3D_PARAM_SUPPORTS_PERFMON,

DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT,

};

struct drm_v3d_get_param {
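
For readers following the uAPI side, here is a minimal userspace sketch of how the structures in this revision of the patch might be filled before a submit ioctl. The struct layouts and flag values are copied from the hunks quoted above and may differ in later revisions; `v3d_fill_multisync` is a hypothetical helper, and the ioctl call itself is left out.

```c
#include <stdint.h>
#include <string.h>

/* Copies of the uAPI definitions as they appear in this revision. */
#define DRM_V3D_IN_SYNC_RCL       0x01
#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01

struct drm_v3d_sem {
	uint32_t handle;
	uint32_t flags;
	uint64_t point;
	uint64_t mbz[2];
};

struct drm_v3d_multi_sync {
	uint64_t in_syncs;       /* user pointer to wait semaphores */
	uint64_t out_syncs;      /* user pointer to signal semaphores */
	uint32_t in_sync_count;
	uint32_t out_sync_count;
	uint32_t flags;          /* DRM_V3D_IN_SYNC_RCL: wait at render stage */
};

struct drm_v3d_extension {
	uint64_t next;
	uint64_t ext_data;
	uint32_t id;
};

/*
 * Hypothetical helper: fills one multisync extension describing the wait
 * and signal arrays. The caller keeps all arrays alive until the submit
 * ioctl returns.
 */
static void v3d_fill_multisync(struct drm_v3d_extension *ext,
			       struct drm_v3d_multi_sync *ms,
			       const struct drm_v3d_sem *waits,
			       uint32_t wait_count,
			       const struct drm_v3d_sem *signals,
			       uint32_t signal_count,
			       int wait_on_render)
{
	memset(ms, 0, sizeof(*ms));
	ms->in_syncs = (uint64_t)(uintptr_t)waits;
	ms->out_syncs = (uint64_t)(uintptr_t)signals;
	ms->in_sync_count = wait_count;
	ms->out_sync_count = signal_count;
	ms->flags = wait_on_render ? DRM_V3D_IN_SYNC_RCL : 0;

	memset(ext, 0, sizeof(*ext));
	ext->id = DRM_V3D_EXT_ID_MULTI_SYNC;
	ext->ext_data = (uint64_t)(uintptr_t)ms;
	ext->next = 0; /* single extension, end of list */
}
```

Going by the ioctl hunks above, the submit arguments would then carry DRM_V3D_SUBMIT_EXTENSION in `flags` and reference the extension through the `extensions` and `extension_count` fields.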


dri-devel@lists.freedesktop.org
