Version 1 of context types:
https://lists.oasis-open.org/archives/virtio-dev/202108/msg00141.html
Changes since RFC: * le32 info --> {u8 ring_idx + u8 padding[3]). * Max rings is now 64.
Anthoine Bourgeois (2): drm/virtio: implement context init: probe for feature drm/virtio: implement context init: support init ioctl
Gurchetan Singh (10): virtio-gpu api: multiple context types with explicit initialization drm/virtgpu api: create context init feature drm/virtio: implement context init: track valid capabilities in a mask drm/virtio: implement context init: track {ring_idx, emit_fence_info} in virtio_gpu_fence drm/virtio: implement context init: plumb {base_fence_ctx, ring_idx} to virtio_gpu_fence_alloc drm/virtio: implement context init: stop using drv->context when creating fence drm/virtio: implement context init: allocate an array of fence contexts drm/virtio: implement context init: handle VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK drm/virtio: implement context init: add virtio_gpu_fence_event drm/virtio: implement context init: advertise feature to userspace
drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 44 ++++- drivers/gpu/drm/virtio/virtgpu_drv.h | 28 +++- drivers/gpu/drm/virtio/virtgpu_fence.c | 30 +++- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 195 +++++++++++++++++++++-- drivers/gpu/drm/virtio/virtgpu_kms.c | 26 ++- drivers/gpu/drm/virtio/virtgpu_plane.c | 3 +- drivers/gpu/drm/virtio/virtgpu_vq.c | 19 +-- include/uapi/drm/virtgpu_drm.h | 27 ++++ include/uapi/linux/virtio_gpu.h | 18 ++- 10 files changed, 355 insertions(+), 36 deletions(-)
This feature allows for each virtio-gpu 3D context to be created with a "context_init" variable. This variable can specify:
- the type of protocol used by the context via the capset id. This is useful for differentiating virgl, gfxstream, and venus protocols by host userspace.
- other things in the future, such as the version of the context.
In addition, each different context needs one or more timelines, so for example a virgl context's waiting can be independent on a gfxstream context's waiting.
VIRTIO_GPU_FLAG_INFO_RING_IDX is introduced to specific to tell the host which per-context command ring (or "hardware queue", distinct from the virtio-queue) the fence should be associated with.
The new capability sets (gfxstream, venus etc.) are only defined in the virtio-gpu spec and not defined in the header.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- include/uapi/linux/virtio_gpu.h | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h index 97523a95781d..b0e3d91dfab7 100644 --- a/include/uapi/linux/virtio_gpu.h +++ b/include/uapi/linux/virtio_gpu.h @@ -59,6 +59,11 @@ * VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB */ #define VIRTIO_GPU_F_RESOURCE_BLOB 3 +/* + * VIRTIO_GPU_CMD_CREATE_CONTEXT with + * context_init and multiple timelines + */ +#define VIRTIO_GPU_F_CONTEXT_INIT 4
enum virtio_gpu_ctrl_type { VIRTIO_GPU_UNDEFINED = 0, @@ -122,14 +127,20 @@ enum virtio_gpu_shm_id { VIRTIO_GPU_SHM_ID_HOST_VISIBLE = 1 };
-#define VIRTIO_GPU_FLAG_FENCE (1 << 0) +#define VIRTIO_GPU_FLAG_FENCE (1 << 0) +/* + * If the following flag is set, then ring_idx contains the index + * of the command ring that needs to used when creating the fence + */ +#define VIRTIO_GPU_FLAG_INFO_RING_IDX (1 << 1)
struct virtio_gpu_ctrl_hdr { __le32 type; __le32 flags; __le64 fence_id; __le32 ctx_id; - __le32 padding; + u8 ring_idx; + u8 padding[3]; };
/* data passed in the cursor vq */ @@ -269,10 +280,11 @@ struct virtio_gpu_resource_create_3d { };
/* VIRTIO_GPU_CMD_CTX_CREATE */ +#define VIRTIO_GPU_CONTEXT_INIT_CAPSET_ID_MASK 0x000000ff struct virtio_gpu_ctx_create { struct virtio_gpu_ctrl_hdr hdr; __le32 nlen; - __le32 padding; + __le32 context_init; char debug_name[64]; };
Hi Gurchetan,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on drm-tip/drm-tip drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next linus/master v5.14 next-20210909] [cannot apply to drm/drm-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Gurchetan-Singh/Context-types/20210... base: git://anongit.freedesktop.org/drm-intel for-linux-next config: i386-randconfig-a013-20210908 (attached as .config) compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 9c476172b93367d2cb88d7d3f4b1b5b456fa6020) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/bbe6e77e1cfa4b70c78ed31a1dde784cc52c... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Gurchetan-Singh/Context-types/20210909-093950 git checkout bbe6e77e1cfa4b70c78ed31a1dde784cc52c68af # save the attached .config to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
In file included from <built-in>:1:
./usr/include/linux/virtio_gpu.h:142:2: error: unknown type name 'u8'
u8 ring_idx; ^ ./usr/include/linux/virtio_gpu.h:143:2: error: unknown type name 'u8' u8 padding[3]; ^
./usr/include/linux/virtio_gpu.h:339:7: error: flexible array member 'capset_data' not allowed in otherwise empty struct
__u8 capset_data[]; ^ 3 errors generated.
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
This change allows creating contexts of depending on set of context parameters. The meaning of each of the parameters is listed below:
1) VIRTGPU_CONTEXT_PARAM_CAPSET_ID
This determines the type of a context based on the capability set ID. For example, the current capsets:
VIRTIO_GPU_CAPSET_VIRGL VIRTIO_GPU_CAPSET_VIRGL2
define a Gallium, TGSI based "virgl" context. We only need 1 capset ID per context type, though virgl has two due a bug that has since been fixed.
The use case is the "gfxstream" rendering library and "venus" renderer.
gfxstream doesn't do Gallium/TGSI translation and mostly relies on auto-generated API streaming. Certain users prefer gfxstream over virgl for GLES on GLES emulation. {gfxstream vk}/{venus} are also required for Vulkan emulation. The maximum capset ID is 63.
The goal is for guest userspace to choose the optimal context type depending on the situation/hardware.
2) VIRTGPU_CONTEXT_PARAM_NUM_RINGS
This tells the number of independent command rings that the context will use. This value may be zero and is inferred to be zero if VIRTGPU_CONTEXT_PARAM_NUM_RINGS is not passed in. This is for backwards compatibility for virgl, which has one big giant command ring for all commands.
The maxiumum number of rings is 64. In practice, multi-queue or multi-ring submission is used for powerful dGPUs and virtio-gpu may not be the best option in that case (see PCI passthrough or rendernode forwarding).
3) VIRTGPU_CONTEXT_PARAM_POLL_RING_IDX_MASK
This is a mask of ring indices for which the DRM fd is pollable. For example, if VIRTGPU_CONTEXT_PARAM_NUM_RINGS is 2, then the mask may be:
[ring idx] | [1 << ring_idx] | final mask ------------------------------------------- 0 1 1 1 2 3
The "Sommelier" guest Wayland proxy uses this to poll for events from the host compositor.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com Acked-by: Nicholas Verne nverne@chromium.org --- include/uapi/drm/virtgpu_drm.h | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h index b9ec26e9c646..a13e20cc66b4 100644 --- a/include/uapi/drm/virtgpu_drm.h +++ b/include/uapi/drm/virtgpu_drm.h @@ -47,12 +47,15 @@ extern "C" { #define DRM_VIRTGPU_WAIT 0x08 #define DRM_VIRTGPU_GET_CAPS 0x09 #define DRM_VIRTGPU_RESOURCE_CREATE_BLOB 0x0a +#define DRM_VIRTGPU_CONTEXT_INIT 0x0b
#define VIRTGPU_EXECBUF_FENCE_FD_IN 0x01 #define VIRTGPU_EXECBUF_FENCE_FD_OUT 0x02 +#define VIRTGPU_EXECBUF_RING_IDX 0x04 #define VIRTGPU_EXECBUF_FLAGS (\ VIRTGPU_EXECBUF_FENCE_FD_IN |\ VIRTGPU_EXECBUF_FENCE_FD_OUT |\ + VIRTGPU_EXECBUF_RING_IDX |\ 0)
struct drm_virtgpu_map { @@ -68,6 +71,8 @@ struct drm_virtgpu_execbuffer { __u64 bo_handles; __u32 num_bo_handles; __s32 fence_fd; /* in/out fence fd (see VIRTGPU_EXECBUF_FENCE_FD_IN/OUT) */ + __u32 ring_idx; /* command ring index (see VIRTGPU_EXECBUF_RING_IDX) */ + __u32 pad; };
#define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */ @@ -75,6 +80,8 @@ struct drm_virtgpu_execbuffer { #define VIRTGPU_PARAM_RESOURCE_BLOB 3 /* DRM_VIRTGPU_RESOURCE_CREATE_BLOB */ #define VIRTGPU_PARAM_HOST_VISIBLE 4 /* Host blob resources are mappable */ #define VIRTGPU_PARAM_CROSS_DEVICE 5 /* Cross virtio-device resource sharing */ +#define VIRTGPU_PARAM_CONTEXT_INIT 6 /* DRM_VIRTGPU_CONTEXT_INIT */ +#define VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs 7 /* Bitmask of supported capability set ids */
struct drm_virtgpu_getparam { __u64 param; @@ -173,6 +180,22 @@ struct drm_virtgpu_resource_create_blob { __u64 blob_id; };
+#define VIRTGPU_CONTEXT_PARAM_CAPSET_ID 0x0001 +#define VIRTGPU_CONTEXT_PARAM_NUM_RINGS 0x0002 +#define VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK 0x0003 +struct drm_virtgpu_context_set_param { + __u64 param; + __u64 value; +}; + +struct drm_virtgpu_context_init { + __u32 num_params; + __u32 pad; + + /* pointer to drm_virtgpu_context_set_param array */ + __u64 ctx_set_params; +}; + #define DRM_IOCTL_VIRTGPU_MAP \ DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_MAP, struct drm_virtgpu_map)
@@ -212,6 +235,10 @@ struct drm_virtgpu_resource_create_blob { DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_RESOURCE_CREATE_BLOB, \ struct drm_virtgpu_resource_create_blob)
+#define DRM_IOCTL_VIRTGPU_CONTEXT_INIT \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_CONTEXT_INIT, \ + struct drm_virtgpu_context_init) + #if defined(__cplusplus) } #endif
The valid capability IDs are between 1 to 63, and defined in the virtio gpu spec. This is used for error checking the subsequent patches. We're currently only using 2 capability IDs, so this should be plenty for the immediate future.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_kms.c | 18 +++++++++++++++++- 2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 0c4810982530..3023e16be0d6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -55,6 +55,8 @@ #define STATE_OK 1 #define STATE_ERR 2
+#define MAX_CAPSET_ID 63 + struct virtio_gpu_object_params { unsigned long size; bool dumb; @@ -245,6 +247,7 @@ struct virtio_gpu_device {
struct virtio_gpu_drv_capset *capsets; uint32_t num_capsets; + uint64_t capset_id_mask; struct list_head cap_cache;
/* protects uuid state when exporting */ diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index f3379059f324..58a65121c200 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -65,6 +65,7 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, int num_capsets) { int i, ret; + bool invalid_capset_id = false;
vgdev->capsets = kcalloc(num_capsets, sizeof(struct virtio_gpu_drv_capset), @@ -78,19 +79,34 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, virtio_gpu_notify(vgdev); ret = wait_event_timeout(vgdev->resp_wq, vgdev->capsets[i].id > 0, 5 * HZ); - if (ret == 0) { + /* + * Capability ids are defined in the virtio-gpu spec and are + * between 1 to 63, inclusive. + */ + if (!vgdev->capsets[i].id || + vgdev->capsets[i].id > MAX_CAPSET_ID) + invalid_capset_id = true; + + if (ret == 0) DRM_ERROR("timed out waiting for cap set %d\n", i); + else if (invalid_capset_id) + DRM_ERROR("invalid capset id %u", vgdev->capsets[i].id); + + if (ret == 0 || invalid_capset_id) { spin_lock(&vgdev->display_info_lock); kfree(vgdev->capsets); vgdev->capsets = NULL; spin_unlock(&vgdev->display_info_lock); return; } + + vgdev->capset_id_mask |= 1 << vgdev->capsets[i].id; DRM_INFO("cap set %d: id %d, max-version %d, max-size %d\n", i, vgdev->capsets[i].id, vgdev->capsets[i].max_version, vgdev->capsets[i].max_size); } + vgdev->num_capsets = num_capsets; }
From: Anthoine Bourgeois anthoine.bourgeois@gmail.com
Let's probe for VIRTIO_GPU_F_CONTEXT_INIT.
Create a new DRM_INFO(..) line since the current one is getting too long.
Signed-off-by: Anthoine Bourgeois anthoine.bourgeois@gmail.com Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_kms.c | 8 +++++++- 4 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c b/drivers/gpu/drm/virtio/virtgpu_debugfs.c index c2b20e0ee030..b6954e2f75e6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c +++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c @@ -52,6 +52,7 @@ static int virtio_gpu_features(struct seq_file *m, void *data) vgdev->has_resource_assign_uuid);
virtio_gpu_add_bool(m, "blob resources", vgdev->has_resource_blob); + virtio_gpu_add_bool(m, "context init", vgdev->has_context_init); virtio_gpu_add_int(m, "cap sets", vgdev->num_capsets); virtio_gpu_add_int(m, "scanouts", vgdev->num_scanouts); if (vgdev->host_visible_region.len) { diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index ed85a7863256..9d963f1fda8f 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -172,6 +172,7 @@ static unsigned int features[] = { VIRTIO_GPU_F_EDID, VIRTIO_GPU_F_RESOURCE_UUID, VIRTIO_GPU_F_RESOURCE_BLOB, + VIRTIO_GPU_F_CONTEXT_INIT, }; static struct virtio_driver virtio_gpu_driver = { .feature_table = features, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 3023e16be0d6..5e1958a522ff 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -236,6 +236,7 @@ struct virtio_gpu_device { bool has_resource_assign_uuid; bool has_resource_blob; bool has_host_visible; + bool has_context_init; struct virtio_shm_region host_visible_region; struct drm_mm host_visible_mm;
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index 58a65121c200..21f410901694 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -191,13 +191,19 @@ int virtio_gpu_init(struct drm_device *dev) (unsigned long)vgdev->host_visible_region.addr, (unsigned long)vgdev->host_visible_region.len); } + if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_CONTEXT_INIT)) { + vgdev->has_context_init = true; + }
- DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible\n", + DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible", vgdev->has_virgl_3d ? '+' : '-', vgdev->has_edid ? '+' : '-', vgdev->has_resource_blob ? '+' : '-', vgdev->has_host_visible ? '+' : '-');
+ DRM_INFO("features: %ccontext_init\n", + vgdev->has_context_init ? '+' : '-'); + ret = virtio_find_vqs(vgdev->vdev, 2, vqs, callbacks, names, NULL); if (ret) { DRM_ERROR("failed to find virt queues\n");
From: Anthoine Bourgeois anthoine.bourgeois@gmail.com
This implements the context initialization ioctl. A list of params is passed in by userspace, and kernel driver validates them. The only currently supported param is VIRTGPU_CONTEXT_PARAM_CAPSET_ID.
If the context has already been initialized, -EEXIST is returned. This happens after Linux userspace does dumb_create + followed by opening the Mesa virgl driver with the same virtgpu instance.
However, for most applications, 3D contexts will be explicitly initialized when the feature is available.
Signed-off-by: Anthoine Bourgeois anthoine.bourgeois@gmail.com Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_drv.h | 6 +- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 96 ++++++++++++++++++++++++-- drivers/gpu/drm/virtio/virtgpu_vq.c | 4 +- 3 files changed, 98 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 5e1958a522ff..9996abf60e3a 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -259,12 +259,13 @@ struct virtio_gpu_device {
struct virtio_gpu_fpriv { uint32_t ctx_id; + uint32_t context_init; bool context_created; struct mutex context_lock; };
/* virtgpu_ioctl.c */ -#define DRM_VIRTIO_NUM_IOCTLS 11 +#define DRM_VIRTIO_NUM_IOCTLS 12 extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS]; void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file);
@@ -342,7 +343,8 @@ int virtio_gpu_cmd_get_capset(struct virtio_gpu_device *vgdev, struct virtio_gpu_drv_cap_cache **cache_p); int virtio_gpu_cmd_get_edids(struct virtio_gpu_device *vgdev); void virtio_gpu_cmd_context_create(struct virtio_gpu_device *vgdev, uint32_t id, - uint32_t nlen, const char *name); + uint32_t context_init, uint32_t nlen, + const char *name); void virtio_gpu_cmd_context_destroy(struct virtio_gpu_device *vgdev, uint32_t id); void virtio_gpu_cmd_context_attach_resource(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index 5c1ad1596889..f5281d1e30e1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -38,20 +38,30 @@ VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \ VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE)
+/* Must be called with &virtio_gpu_fpriv.struct_mutex held. */ +static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev, + struct virtio_gpu_fpriv *vfpriv) +{ + char dbgname[TASK_COMM_LEN]; + + get_task_comm(dbgname, current); + virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id, + vfpriv->context_init, strlen(dbgname), + dbgname); + + vfpriv->context_created = true; +} + void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file) { struct virtio_gpu_device *vgdev = dev->dev_private; struct virtio_gpu_fpriv *vfpriv = file->driver_priv; - char dbgname[TASK_COMM_LEN];
mutex_lock(&vfpriv->context_lock); if (vfpriv->context_created) goto out_unlock;
- get_task_comm(dbgname, current); - virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id, - strlen(dbgname), dbgname); - vfpriv->context_created = true; + virtio_gpu_create_context_locked(vgdev, vfpriv);
out_unlock: mutex_unlock(&vfpriv->context_lock); @@ -662,6 +672,79 @@ static int virtio_gpu_resource_create_blob_ioctl(struct drm_device *dev, return 0; }
+static int virtio_gpu_context_init_ioctl(struct drm_device *dev, + void *data, struct drm_file *file) +{ + int ret = 0; + uint32_t num_params, i, param, value; + size_t len; + struct drm_virtgpu_context_set_param *ctx_set_params = NULL; + struct virtio_gpu_device *vgdev = dev->dev_private; + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + struct drm_virtgpu_context_init *args = data; + + num_params = args->num_params; + len = num_params * sizeof(struct drm_virtgpu_context_set_param); + + if (!vgdev->has_context_init || !vgdev->has_virgl_3d) + return -EINVAL; + + /* Number of unique parameters supported at this time. */ + if (num_params > 1) + return -EINVAL; + + ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), + len); + + if (IS_ERR(ctx_set_params)) + return PTR_ERR(ctx_set_params); + + mutex_lock(&vfpriv->context_lock); + if (vfpriv->context_created) { + ret = -EEXIST; + goto out_unlock; + } + + for (i = 0; i < num_params; i++) { + param = ctx_set_params[i].param; + value = ctx_set_params[i].value; + + switch (param) { + case VIRTGPU_CONTEXT_PARAM_CAPSET_ID: + if (value > MAX_CAPSET_ID) { + ret = -EINVAL; + goto out_unlock; + } + + if ((vgdev->capset_id_mask & (1 << value)) == 0) { + ret = -EINVAL; + goto out_unlock; + } + + /* Context capset ID already set */ + if (vfpriv->context_init & + VIRTIO_GPU_CONTEXT_INIT_CAPSET_ID_MASK) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->context_init |= value; + break; + default: + ret = -EINVAL; + goto out_unlock; + } + } + + virtio_gpu_create_context_locked(vgdev, vfpriv); + virtio_gpu_notify(vgdev); + +out_unlock: + mutex_unlock(&vfpriv->context_lock); + kfree(ctx_set_params); + return ret; +} + struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS] = { DRM_IOCTL_DEF_DRV(VIRTGPU_MAP, virtio_gpu_map_ioctl, DRM_RENDER_ALLOW), @@ -698,4 +781,7 @@ struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS] = { DRM_IOCTL_DEF_DRV(VIRTGPU_RESOURCE_CREATE_BLOB, virtio_gpu_resource_create_blob_ioctl, DRM_RENDER_ALLOW), + + DRM_IOCTL_DEF_DRV(VIRTGPU_CONTEXT_INIT, virtio_gpu_context_init_ioctl, + DRM_RENDER_ALLOW), }; diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 2e71e91278b4..496f8ce4cd41 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -917,7 +917,8 @@ int virtio_gpu_cmd_get_edids(struct virtio_gpu_device *vgdev) }
void virtio_gpu_cmd_context_create(struct virtio_gpu_device *vgdev, uint32_t id, - uint32_t nlen, const char *name) + uint32_t context_init, uint32_t nlen, + const char *name) { struct virtio_gpu_ctx_create *cmd_p; struct virtio_gpu_vbuffer *vbuf; @@ -928,6 +929,7 @@ void virtio_gpu_cmd_context_create(struct virtio_gpu_device *vgdev, uint32_t id, cmd_p->hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_CTX_CREATE); cmd_p->hdr.ctx_id = cpu_to_le32(id); cmd_p->nlen = cpu_to_le32(nlen); + cmd_p->context_init = cpu_to_le32(context_init); strncpy(cmd_p->debug_name, name, sizeof(cmd_p->debug_name) - 1); cmd_p->debug_name[sizeof(cmd_p->debug_name) - 1] = 0; virtio_gpu_queue_ctrl_buffer(vgdev, vbuf);
Each fence should be associated with a [fence ID, fence_context, seqno]. The seqno number is just the fence id.
To get the fence context, we add the ring_idx to the 3D context's base_fence_ctx. The ring_idx is between 0 and 31, inclusive.
Each 3D context will have it's own base_fence_ctx. The ring_idx will be emitted to host userspace, when emit_fence_info is true.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_drv.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 9996abf60e3a..401aec1a5efb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -139,7 +139,9 @@ struct virtio_gpu_fence_driver {
struct virtio_gpu_fence { struct dma_fence f; + uint32_t ring_idx; uint64_t fence_id; + bool emit_fence_info; struct virtio_gpu_fence_driver *drv; struct list_head node; };
These were defined in the previous commit. We'll need these parameters when allocating a dma_fence. The use case for this is multiple synchronizations timelines.
The maximum number of timelines per 3D instance will be 32. Usually, only 2 are needed -- one for CPU commands, and another for GPU commands.
As such, we'll need to specify these parameters when allocating a dma_fence.
vgdev->fence_drv.context is the "default" fence context for 2D mode and old userspace.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_drv.h | 5 +++-- drivers/gpu/drm/virtio/virtgpu_fence.c | 4 +++- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 +++++---- drivers/gpu/drm/virtio/virtgpu_plane.c | 3 ++- 4 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 401aec1a5efb..a5142d60c2fa 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -426,8 +426,9 @@ struct drm_plane *virtio_gpu_plane_init(struct virtio_gpu_device *vgdev, int index);
/* virtgpu_fence.c */ -struct virtio_gpu_fence *virtio_gpu_fence_alloc( - struct virtio_gpu_device *vgdev); +struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, + uint64_t base_fence_ctx, + uint32_t ring_idx); void virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev, struct virtio_gpu_ctrl_hdr *cmd_hdr, struct virtio_gpu_fence *fence); diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index d28e25e8409b..24c728b65d21 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -71,7 +71,9 @@ static const struct dma_fence_ops virtio_gpu_fence_ops = { .timeline_value_str = virtio_gpu_timeline_value_str, };
-struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev) +struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, + uint64_t base_fence_ctx, + uint32_t ring_idx) { struct virtio_gpu_fence_driver *drv = &vgdev->fence_drv; struct virtio_gpu_fence *fence = kzalloc(sizeof(struct virtio_gpu_fence), diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f5281d1e30e1..f51f3393a194 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -173,7 +173,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; }
- out_fence = virtio_gpu_fence_alloc(vgdev); + out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if(!out_fence) { ret = -ENOMEM; goto out_unresv; @@ -288,7 +288,7 @@ static int virtio_gpu_resource_create_ioctl(struct drm_device *dev, void *data, if (params.size == 0) params.size = PAGE_SIZE;
- fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if (!fence) return -ENOMEM; ret = virtio_gpu_object_create(vgdev, ¶ms, &qobj, fence); @@ -367,7 +367,7 @@ static int virtio_gpu_transfer_from_host_ioctl(struct drm_device *dev, if (ret != 0) goto err_put_free;
- fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if (!fence) { ret = -ENOMEM; goto err_unlock; @@ -427,7 +427,8 @@ static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data, goto err_put_free;
ret = -ENOMEM; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, + 0); if (!fence) goto err_unlock;
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c index a49fd9480381..6d3cc9e238a4 100644 --- a/drivers/gpu/drm/virtio/virtgpu_plane.c +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c @@ -256,7 +256,8 @@ static int virtio_gpu_plane_prepare_fb(struct drm_plane *plane, return 0;
if (bo->dumb && (plane->state->fb != new_state->fb)) { - vgfb->fence = virtio_gpu_fence_alloc(vgdev); + vgfb->fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, + 0); if (!vgfb->fence) return -ENOMEM; }
The plumbing is all here to do this. Since we always use the default fence context when allocating a fence, this makes no functional difference.
We can't process just the largest fence id anymore, since it's it's associated with different timelines. It's fine for fence_id 260 to signal before 259. As such, process each fence_id individually.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_fence.c | 16 ++++++++++++++-- drivers/gpu/drm/virtio/virtgpu_vq.c | 15 +++------------ 2 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index 24c728b65d21..98a00c1e654d 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -75,20 +75,25 @@ struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, uint64_t base_fence_ctx, uint32_t ring_idx) { + uint64_t fence_context = base_fence_ctx + ring_idx; struct virtio_gpu_fence_driver *drv = &vgdev->fence_drv; struct virtio_gpu_fence *fence = kzalloc(sizeof(struct virtio_gpu_fence), GFP_KERNEL); + if (!fence) return fence;
fence->drv = drv; + fence->ring_idx = ring_idx; + fence->emit_fence_info = !(base_fence_ctx == drv->context);
/* This only partially initializes the fence because the seqno is * unknown yet. The fence must not be used outside of the driver * until virtio_gpu_fence_emit is called. */ - dma_fence_init(&fence->f, &virtio_gpu_fence_ops, &drv->lock, drv->context, - 0); + + dma_fence_init(&fence->f, &virtio_gpu_fence_ops, &drv->lock, + fence_context, 0);
return fence; } @@ -110,6 +115,13 @@ void virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev,
cmd_hdr->flags |= cpu_to_le32(VIRTIO_GPU_FLAG_FENCE); cmd_hdr->fence_id = cpu_to_le64(fence->fence_id); + + /* Only currently defined fence param. */ + if (fence->emit_fence_info) { + cmd_hdr->flags |= + cpu_to_le32(VIRTIO_GPU_FLAG_INFO_RING_IDX); + cmd_hdr->ring_idx = (u8)fence->ring_idx; + } }
void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 496f8ce4cd41..938331554632 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -205,7 +205,7 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work) struct list_head reclaim_list; struct virtio_gpu_vbuffer *entry, *tmp; struct virtio_gpu_ctrl_hdr *resp; - u64 fence_id = 0; + u64 fence_id;
INIT_LIST_HEAD(&reclaim_list); spin_lock(&vgdev->ctrlq.qlock); @@ -232,23 +232,14 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work) DRM_DEBUG("response 0x%x\n", le32_to_cpu(resp->type)); } if (resp->flags & cpu_to_le32(VIRTIO_GPU_FLAG_FENCE)) { - u64 f = le64_to_cpu(resp->fence_id); - - if (fence_id > f) { - DRM_ERROR("%s: Oops: fence %llx -> %llx\n", - __func__, fence_id, f); - } else { - fence_id = f; - } + fence_id = le64_to_cpu(resp->fence_id); + virtio_gpu_fence_event_process(vgdev, fence_id); } if (entry->resp_cb) entry->resp_cb(vgdev, entry); } wake_up(&vgdev->ctrlq.ack_queue);
- if (fence_id) - virtio_gpu_fence_event_process(vgdev, fence_id); - list_for_each_entry_safe(entry, tmp, &reclaim_list, list) { if (entry->objs) virtio_gpu_array_put_free_delayed(vgdev, entry->objs);
On Wed, Sep 08, 2021 at 06:37:13PM -0700, Gurchetan Singh wrote:
The plumbing is all here to do this. Since we always use the default fence context when allocating a fence, this makes no functional difference.
We can't process just the largest fence id anymore, since it's it's associated with different timelines. It's fine for fence_id 260 to signal before 259. As such, process each fence_id individually.
I guess you need to also update virtio_gpu_fence_event_process() then? It currently has the strict ordering logic baked in ...
take care, Gerd
On Tue, Sep 14, 2021 at 10:53 PM Gerd Hoffmann kraxel@redhat.com wrote:
On Wed, Sep 08, 2021 at 06:37:13PM -0700, Gurchetan Singh wrote:
The plumbing is all here to do this. Since we always use the default fence context when allocating a fence, this makes no functional difference.
We can't process just the largest fence id anymore, since it's it's associated with different timelines. It's fine for fence_id 260 to signal before 259. As such, process each fence_id individually.
I guess you need to also update virtio_gpu_fence_event_process() then? It currently has the strict ordering logic baked in ...
The update to virtio_gpu_fence_event_process was done as a preparation a few months back:
https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/virtio/virt...
take care, Gerd
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
Hi,
I guess you need to also update virtio_gpu_fence_event_process() then? It currently has the strict ordering logic baked in ...
The update to virtio_gpu_fence_event_process was done as a preparation a few months back:
https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/virtio/virt...
Ah, ok, missed the detail that the context check is already there.
thanks, Gerd
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++-- 2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created; + uint32_t num_rings; + uint64_t base_fence_ctx; struct mutex context_lock; };
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf; + uint64_t fence_ctx; + uint32_t ring_idx; + + fence_ctx = vgdev->fence_drv.context; + ring_idx = 0;
if (vgdev->has_virgl_3d == false) return -ENOSYS; @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
+ if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) { + if (exbuf->ring_idx >= vfpriv->num_rings) + return -EINVAL; + + if (!vfpriv->base_fence_ctx) + return -EINVAL; + + fence_ctx = vfpriv->base_fence_ctx; + ring_idx = exbuf->ring_idx; + } + exbuf->fence_fd = -1;
virtio_gpu_create_context(dev, file); @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; }
- out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); + out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv; @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL;
/* Number of unique parameters supported at this time. */ - if (num_params > 1) + if (num_params > 2) return -EINVAL;
ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
vfpriv->context_init |= value; break; + case VIRTGPU_CONTEXT_PARAM_NUM_RINGS: + if (vfpriv->base_fence_ctx) { + ret = -EINVAL; + goto out_unlock; + } + + if (value > MAX_RINGS) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->base_fence_ctx = dma_fence_context_alloc(value); + vfpriv->num_rings = value; + break; default: ret = -EINVAL; goto out_unlock;
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com
drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++-- 2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created;
uint32_t num_rings;
uint64_t base_fence_ctx; struct mutex context_lock;
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf;
uint64_t fence_ctx;
uint32_t ring_idx;
fence_ctx = vgdev->fence_drv.context;
ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS;
@@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
if (exbuf->ring_idx >= vfpriv->num_rings)
return -EINVAL;
if (!vfpriv->base_fence_ctx)
return -EINVAL;
fence_ctx = vfpriv->base_fence_ctx;
ring_idx = exbuf->ring_idx;
}
exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file);
@@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; }
out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv;
@@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL;
/* Number of unique parameters supported at this time. */
if (num_params > 1)
if (num_params > 2) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
@@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
vfpriv->context_init |= value; break;
case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
if (vfpriv->base_fence_ctx) {
ret = -EINVAL;
goto out_unlock;
}
if (value > MAX_RINGS) {
ret = -EINVAL;
goto out_unlock;
}
vfpriv->base_fence_ctx = dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit fencing.
The classic example is Mesa and X server. When both use virgl and the global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
vfpriv->num_rings = value;
break; default: ret = -EINVAL; goto out_unlock;
-- 2.33.0.153.gba50c8fa24-goog
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com wrote:
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com
drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++-- 2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created;
uint32_t num_rings;
uint64_t base_fence_ctx; struct mutex context_lock;
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf;
uint64_t fence_ctx;
uint32_t ring_idx;
fence_ctx = vgdev->fence_drv.context;
ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS;
@@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
if (exbuf->ring_idx >= vfpriv->num_rings)
return -EINVAL;
if (!vfpriv->base_fence_ctx)
return -EINVAL;
fence_ctx = vfpriv->base_fence_ctx;
ring_idx = exbuf->ring_idx;
}
exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file);
@@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
goto out_memdup; }
out_fence = virtio_gpu_fence_alloc(vgdev,
vgdev->fence_drv.context, 0);
out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv;
@@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct
drm_device *dev,
return -EINVAL; /* Number of unique parameters supported at this time. */
if (num_params > 1)
if (num_params > 2) return -EINVAL; ctx_set_params =
memdup_user(u64_to_user_ptr(args->ctx_set_params),
@@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct
drm_device *dev,
vfpriv->context_init |= value; break;
case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
if (vfpriv->base_fence_ctx) {
ret = -EINVAL;
goto out_unlock;
}
if (value > MAX_RINGS) {
ret = -EINVAL;
goto out_unlock;
}
vfpriv->base_fence_ctx =
dma_fence_context_alloc(value); With multiple fence contexts, we should do something about implicit fencing.
The classic example is Mesa and X server. When both use virgl and the global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be inserted? Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
It may be possible to wait on the reservation object associated with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
vfpriv->num_rings = value;
break; default: ret = -EINVAL; goto out_unlock;
-- 2.33.0.153.gba50c8fa24-goog
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
.
On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com wrote:
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com
drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++-- 2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created;
uint32_t num_rings;
uint64_t base_fence_ctx; struct mutex context_lock;
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf;
uint64_t fence_ctx;
uint32_t ring_idx;
fence_ctx = vgdev->fence_drv.context;
ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS;
@@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
if (exbuf->ring_idx >= vfpriv->num_rings)
return -EINVAL;
if (!vfpriv->base_fence_ctx)
return -EINVAL;
fence_ctx = vfpriv->base_fence_ctx;
ring_idx = exbuf->ring_idx;
}
exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file);
@@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; }
out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv;
@@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL;
/* Number of unique parameters supported at this time. */
if (num_params > 1)
if (num_params > 2) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
@@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
vfpriv->context_init |= value; break;
case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
if (vfpriv->base_fence_ctx) {
ret = -EINVAL;
goto out_unlock;
}
if (value > MAX_RINGS) {
ret = -EINVAL;
goto out_unlock;
}
vfpriv->base_fence_ctx = dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit fencing.
The classic example is Mesa and X server. When both use virgl and the global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be inserted? Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
Fences can be implicitly shared via reservation objects associated with buffer objects.
It may be possible to wait on the reservation object associated with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
Right, that's what I am looking for. Userspace expects implicit fencing to work. While there are works to move the userspace to do explicit fencing, it is not there yet in general and we can't require the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
vfpriv->num_rings = value;
break; default: ret = -EINVAL; goto out_unlock;
-- 2.33.0.153.gba50c8fa24-goog
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu olvaffe@gmail.com wrote:
.
On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com wrote:
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com
drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34
++++++++++++++++++++++++--
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created;
uint32_t num_rings;
uint64_t base_fence_ctx; struct mutex context_lock;
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf;
uint64_t fence_ctx;
uint32_t ring_idx;
fence_ctx = vgdev->fence_drv.context;
ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS;
@@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
if (exbuf->ring_idx >= vfpriv->num_rings)
return -EINVAL;
if (!vfpriv->base_fence_ctx)
return -EINVAL;
fence_ctx = vfpriv->base_fence_ctx;
ring_idx = exbuf->ring_idx;
}
exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file);
@@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
goto out_memdup; }
out_fence = virtio_gpu_fence_alloc(vgdev,
vgdev->fence_drv.context, 0);
out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx,
ring_idx);
if(!out_fence) { ret = -ENOMEM; goto out_unresv;
@@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct
drm_device *dev,
return -EINVAL; /* Number of unique parameters supported at this time. */
if (num_params > 1)
if (num_params > 2) return -EINVAL; ctx_set_params =
memdup_user(u64_to_user_ptr(args->ctx_set_params),
@@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct
drm_device *dev,
vfpriv->context_init |= value; break;
case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
if (vfpriv->base_fence_ctx) {
ret = -EINVAL;
goto out_unlock;
}
if (value > MAX_RINGS) {
ret = -EINVAL;
goto out_unlock;
}
vfpriv->base_fence_ctx =
dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit
fencing.
The classic example is Mesa and X server. When both use virgl and the global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be inserted?
Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
Fences can be implicitly shared via reservation objects associated with buffer objects.
It may be possible to wait on the reservation object associated with a
buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
Right, that's what I am looking for. Userspace expects implicit fencing to work. While there are works to move the userspace to do explicit fencing, it is not there yet in general and we can't require the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
Another option would be to use the upcoming DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context).
Generally, if it only requires virgl changes, userspace changes are fine since OpenGL drivers implement implicit sync in many ways. Waiting on the reservation object in the kernel is fine too though.
Though venus doesn't use the NUM_RINGS param yet. Getting all permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
WDYT of someone figuring out virgl/venus interop later, independently of this patchset?
vfpriv->num_rings = value;
break; default: ret = -EINVAL; goto out_unlock;
-- 2.33.0.153.gba50c8fa24-goog
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
,On Mon, Sep 13, 2021 at 6:57 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu olvaffe@gmail.com wrote:
.
On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com wrote:
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com
drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++-- 2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created;
uint32_t num_rings;
uint64_t base_fence_ctx; struct mutex context_lock;
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf;
uint64_t fence_ctx;
uint32_t ring_idx;
fence_ctx = vgdev->fence_drv.context;
ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS;
@@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
if (exbuf->ring_idx >= vfpriv->num_rings)
return -EINVAL;
if (!vfpriv->base_fence_ctx)
return -EINVAL;
fence_ctx = vfpriv->base_fence_ctx;
ring_idx = exbuf->ring_idx;
}
exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file);
@@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; }
out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv;
@@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL;
/* Number of unique parameters supported at this time. */
if (num_params > 1)
if (num_params > 2) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
@@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
vfpriv->context_init |= value; break;
case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
if (vfpriv->base_fence_ctx) {
ret = -EINVAL;
goto out_unlock;
}
if (value > MAX_RINGS) {
ret = -EINVAL;
goto out_unlock;
}
vfpriv->base_fence_ctx = dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit fencing.
The classic example is Mesa and X server. When both use virgl and the global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be inserted? Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
Fences can be implicitly shared via reservation objects associated with buffer objects.
It may be possible to wait on the reservation object associated with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
Right, that's what I am looking for. Userspace expects implicit fencing to work. While there are works to move the userspace to do explicit fencing, it is not there yet in general and we can't require the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
Another option would be to use the upcoming DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context).
That requires the X server / compositors to be modified. For example, venus works under Android (where there is explicit fencing) or under a modified compositor (which does DMA_BUF_IOCTL_EXPORT_SYNC_FILE or DRM_IOCTL_VIRTGPU_WAIT). But it does not work too well with an unmodified X server.
Generally, if it only requires virgl changes, userspace changes are fine since OpenGL drivers implement implicit sync in many ways. Waiting on the reservation object in the kernel is fine too though.
I don't think we want to assume virgl to be the only consumer of dma-bufs, despite that it is the most common use case.
Though venus doesn't use the NUM_RINGS param yet. Getting all permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
WDYT of someone figuring out virgl/venus interop later, independently of this patchset?
I think we should understand the implications of multiple fence contexts better, even if some changes are not included in this patchset.
From my view, we don't need implicit fencing in most cases and
implicit fencing should be considered a legacy path. But X server / compositors today happen to require it. Other drivers seem to use a flag to control whether implicit fences are set up or waited (e.g., AMDGPU_GEM_CREATE_EXPLICIT_SYNC, MSM_SUBMIT_NO_IMPLICIT, or EXEC_OBJECT_WRITE). It seems to be the least surprising thing to do.
vfpriv->num_rings = value;
break; default: ret = -EINVAL; goto out_unlock;
-- 2.33.0.153.gba50c8fa24-goog
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
On Tue, Sep 14, 2021 at 10:53 AM Chia-I Wu olvaffe@gmail.com wrote:
,On Mon, Sep 13, 2021 at 6:57 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu olvaffe@gmail.com wrote:
.
On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com wrote:
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of ring each context wants.
Execbuffer can specify which ring to use.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com
drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34
++++++++++++++++++++++++--
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2
#define MAX_CAPSET_ID 63 +#define MAX_RINGS 64
struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created;
uint32_t num_rings;
uint64_t base_fence_ctx; struct mutex context_lock;
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf;
uint64_t fence_ctx;
uint32_t ring_idx;
fence_ctx = vgdev->fence_drv.context;
ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS;
@@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL;
if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
if (exbuf->ring_idx >= vfpriv->num_rings)
return -EINVAL;
if (!vfpriv->base_fence_ctx)
return -EINVAL;
fence_ctx = vfpriv->base_fence_ctx;
ring_idx = exbuf->ring_idx;
}
exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file);
@@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct
drm_device *dev, void *data,
goto out_memdup; }
out_fence = virtio_gpu_fence_alloc(vgdev,
vgdev->fence_drv.context, 0);
out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx,
ring_idx);
if(!out_fence) { ret = -ENOMEM; goto out_unresv;
@@ -691,7 +707,7 @@ static int
virtio_gpu_context_init_ioctl(struct drm_device *dev,
return -EINVAL; /* Number of unique parameters supported at this time. */
if (num_params > 1)
if (num_params > 2) return -EINVAL; ctx_set_params =
memdup_user(u64_to_user_ptr(args->ctx_set_params),
@@ -731,6 +747,20 @@ static int
virtio_gpu_context_init_ioctl(struct drm_device *dev,
vfpriv->context_init |= value; break;
case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
if (vfpriv->base_fence_ctx) {
ret = -EINVAL;
goto out_unlock;
}
if (value > MAX_RINGS) {
ret = -EINVAL;
goto out_unlock;
}
vfpriv->base_fence_ctx =
dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit
fencing.
The classic example is Mesa and X server. When both use virgl and
the
global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be inserted?
Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
Fences can be implicitly shared via reservation objects associated with buffer objects.
It may be possible to wait on the reservation object associated with
a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
Right, that's what I am looking for. Userspace expects implicit fencing to work. While there are works to move the userspace to do explicit fencing, it is not there yet in general and we can't require the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
Another option would be to use the upcoming
DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context). That requires the X server / compositors to be modified. For example, venus works under Android (where there is explicit fencing) or under a modified compositor (which does DMA_BUF_IOCTL_EXPORT_SYNC_FILE or DRM_IOCTL_VIRTGPU_WAIT). But it does not work too well with an unmodified X server.
Some semi-recent virgl modifications will be needed regardless for interop, such as VIRGL_CAP_V2_UNTYPED_RESOURCE (?).
Not sure aren't too many virgl users (most developers)
Does Xserver just pick up the latest Mesa release (including virgl/venus)? Suppose context types land in 5.16, the userspace changes land (both Venus/Virgl) in 21.2 stable releases.
https://docs.mesa3d.org/release-calendar.html
Generally, if it only requires virgl changes, userspace changes are fine
since OpenGL drivers implement implicit sync in many ways. Waiting on the reservation object in the kernel is fine too though. I don't think we want to assume virgl to be the only consumer of dma-bufs, despite that it is the most common use case.
Though venus doesn't use the NUM_RINGS param yet. Getting all
permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
WDYT of someone figuring out virgl/venus interop later, independently of
this patchset?
I think we should understand the implications of multiple fence contexts better, even if some changes are not included in this patchset.
From my view, we don't need implicit fencing in most cases and implicit fencing should be considered a legacy path. But X server / compositors today happen to require it. Other drivers seem to use a flag to control whether implicit fences are set up or waited (e.g., AMDGPU_GEM_CREATE_EXPLICIT_SYNC, MSM_SUBMIT_NO_IMPLICIT, or EXEC_OBJECT_WRITE). It seems to be the least surprising thing to do.
IMO, the easiest way is just to limit the change to userspace if possible since implicit sync is legacy/something we want to deprecate over time.
Another option is to add something like VIRTGPU_EXECBUF_EXPLICIT_SYNC (similar to MSM_SUBMIT_NO_IMPLICIT), where the reservation objects are waited on / added to without that flag. Since explicit sync will need new hypercalls/params and is a major, that feature is expected to be independent of context types.
With that option, waiting on the reservation object would just be another bug fix + addition to 5.16 (perhaps by you) so we can proceed in parallel faster. VIRTGPU_EXECBUF_EXPLICIT_SYNC (or an equivalent) would be added later.
vfpriv->num_rings = value;
break; default: ret = -EINVAL; goto out_unlock;
-- 2.33.0.153.gba50c8fa24-goog
To unsubscribe, e-mail:
virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail:
virtio-dev-help@lists.oasis-open.org
i
On Tue, Sep 14, 2021 at 6:26 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Tue, Sep 14, 2021 at 10:53 AM Chia-I Wu olvaffe@gmail.com wrote:
,On Mon, Sep 13, 2021 at 6:57 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu olvaffe@gmail.com wrote:
.
On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com wrote:
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh gurchetansingh@chromium.org wrote: > > We don't want fences from different 3D contexts (virgl, gfxstream, > venus) to be on the same timeline. With explicit context creation, > we can specify the number of ring each context wants. > > Execbuffer can specify which ring to use. > > Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org > Acked-by: Lingfeng Yang lfy@google.com > --- > drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++-- > 2 files changed, 35 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h > index a5142d60c2fa..cca9ab505deb 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_drv.h > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h > @@ -56,6 +56,7 @@ > #define STATE_ERR 2 > > #define MAX_CAPSET_ID 63 > +#define MAX_RINGS 64 > > struct virtio_gpu_object_params { > unsigned long size; > @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { > uint32_t ctx_id; > uint32_t context_init; > bool context_created; > + uint32_t num_rings; > + uint64_t base_fence_ctx; > struct mutex context_lock; > }; > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > index f51f3393a194..262f79210283 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, > int in_fence_fd = exbuf->fence_fd; > int out_fence_fd = -1; > void *buf; > + uint64_t fence_ctx; > + uint32_t ring_idx; > + > + fence_ctx = vgdev->fence_drv.context; > + ring_idx = 0; > > if (vgdev->has_virgl_3d == false) > return -ENOSYS; > @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, > if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) > return -EINVAL; > > + if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) { > + if (exbuf->ring_idx >= vfpriv->num_rings) > + return -EINVAL; > + > + if (!vfpriv->base_fence_ctx) > + return -EINVAL; > + > + fence_ctx = vfpriv->base_fence_ctx; > + ring_idx = exbuf->ring_idx; > + } > + > exbuf->fence_fd = -1; > > virtio_gpu_create_context(dev, file); > @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, > goto out_memdup; > } > > - out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); > + out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); > if(!out_fence) { > ret = -ENOMEM; > goto out_unresv; > @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, > return -EINVAL; > > /* Number of unique parameters supported at this time. */ > - if (num_params > 1) > + if (num_params > 2) > return -EINVAL; > > ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), > @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, > > vfpriv->context_init |= value; > break; > + case VIRTGPU_CONTEXT_PARAM_NUM_RINGS: > + if (vfpriv->base_fence_ctx) { > + ret = -EINVAL; > + goto out_unlock; > + } > + > + if (value > MAX_RINGS) { > + ret = -EINVAL; > + goto out_unlock; > + } > + > + vfpriv->base_fence_ctx = dma_fence_context_alloc(value); With multiple fence contexts, we should do something about implicit fencing.
The classic example is Mesa and X server. When both use virgl and the global fence context, no dma_fence_wait is fine. But when Mesa uses venus and the ring fence context, dma_fence_wait should be inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be inserted? Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
Fences can be implicitly shared via reservation objects associated with buffer objects.
It may be possible to wait on the reservation object associated with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
Right, that's what I am looking for. Userspace expects implicit fencing to work. While there are works to move the userspace to do explicit fencing, it is not there yet in general and we can't require the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
Another option would be to use the upcoming DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context).
That requires the X server / compositors to be modified. For example, venus works under Android (where there is explicit fencing) or under a modified compositor (which does DMA_BUF_IOCTL_EXPORT_SYNC_FILE or DRM_IOCTL_VIRTGPU_WAIT). But it does not work too well with an unmodified X server.
Some semi-recent virgl modifications will be needed regardless for interop, such as VIRGL_CAP_V2_UNTYPED_RESOURCE (?).
Not sure aren't too many virgl users (most developers)
Does Xserver just pick up the latest Mesa release (including virgl/venus)? Suppose context types land in 5.16, the userspace changes land (both Venus/Virgl) in 21.2 stable releases.
https://docs.mesa3d.org/release-calendar.html
Generally, if it only requires virgl changes, userspace changes are fine since OpenGL drivers implement implicit sync in many ways. Waiting on the reservation object in the kernel is fine too though.
I don't think we want to assume virgl to be the only consumer of dma-bufs, despite that it is the most common use case.
Though venus doesn't use the NUM_RINGS param yet. Getting all permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
WDYT of someone figuring out virgl/venus interop later, independently of this patchset?
I think we should understand the implications of multiple fence contexts better, even if some changes are not included in this patchset.
From my view, we don't need implicit fencing in most cases and implicit fencing should be considered a legacy path. But X server / compositors today happen to require it. Other drivers seem to use a flag to control whether implicit fences are set up or waited (e.g., AMDGPU_GEM_CREATE_EXPLICIT_SYNC, MSM_SUBMIT_NO_IMPLICIT, or EXEC_OBJECT_WRITE). It seems to be the least surprising thing to do.
IMO, the easiest way is just to limit the change to userspace if possible since implicit sync is legacy/something we want to deprecate over time.
Another option is to add something like VIRTGPU_EXECBUF_EXPLICIT_SYNC (similar to MSM_SUBMIT_NO_IMPLICIT), where the reservation objects are waited on / added to without that flag. Since explicit sync will need new hypercalls/params and is a major, that feature is expected to be independent of context types.
With that option, waiting on the reservation object would just be another bug fix + addition to 5.16 (perhaps by you) so we can proceed in parallel faster. VIRTGPU_EXECBUF_EXPLICIT_SYNC (or an equivalent) would be added later.
It is fine to add it later, but VIRTGPU_EXECBUF_EXPLICIT_SYNC sounds better to me than a userspace solution. I don't think we need a new hypercall as the wait can be a guest-side wait, similar to how VIRTGPU_EXECBUF_FENCE_FD_IN is a guest-side wait. The flag can also suppress VIRTIO_GPU_FLAG_FENCE and make the SUBMIT_3D hypercall cheaper.
Even before this patchset, unless I miss something, it seems the fact that we have a global fence context and assume all host GL contexts are on the same timeline is not exactly correct. When glFlush is called on two host GL contexts, the flush order is not exactly the same as the execution order. But that is a different issue that can be solved in virglrenderer.
> + vfpriv->num_rings = value; > + break; > default: > ret = -EINVAL; > goto out_unlock; > -- > 2.33.0.153.gba50c8fa24-goog > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org >
On Wed, Sep 15, 2021 at 5:11 PM Chia-I Wu olvaffe@gmail.com wrote:
i
On Tue, Sep 14, 2021 at 6:26 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Tue, Sep 14, 2021 at 10:53 AM Chia-I Wu olvaffe@gmail.com wrote:
,On Mon, Sep 13, 2021 at 6:57 PM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu olvaffe@gmail.com wrote:
.
On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh gurchetansingh@chromium.org wrote:
On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu olvaffe@gmail.com
wrote:
> > On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh > gurchetansingh@chromium.org wrote: > > > > We don't want fences from different 3D contexts (virgl,
gfxstream,
> > venus) to be on the same timeline. With explicit context
creation,
> > we can specify the number of ring each context wants. > > > > Execbuffer can specify which ring to use. > > > > Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org > > Acked-by: Lingfeng Yang lfy@google.com > > --- > > drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ > > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34
++++++++++++++++++++++++--
> > 2 files changed, 35 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h
b/drivers/gpu/drm/virtio/virtgpu_drv.h
> > index a5142d60c2fa..cca9ab505deb 100644 > > --- a/drivers/gpu/drm/virtio/virtgpu_drv.h > > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h > > @@ -56,6 +56,7 @@ > > #define STATE_ERR 2 > > > > #define MAX_CAPSET_ID 63 > > +#define MAX_RINGS 64 > > > > struct virtio_gpu_object_params { > > unsigned long size; > > @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { > > uint32_t ctx_id; > > uint32_t context_init; > > bool context_created; > > + uint32_t num_rings; > > + uint64_t base_fence_ctx; > > struct mutex context_lock; > > }; > > > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > index f51f3393a194..262f79210283 100644 > > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > @@ -99,6 +99,11 @@ static int
virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> > int in_fence_fd = exbuf->fence_fd; > > int out_fence_fd = -1; > > void *buf; > > + uint64_t fence_ctx; > > + uint32_t ring_idx; > > + > > + fence_ctx = vgdev->fence_drv.context; > > + ring_idx = 0; > > > > if (vgdev->has_virgl_3d == false) > > return -ENOSYS; > > @@ -106,6 +111,17 @@ static int
virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> > if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) > > return -EINVAL; > > > > + if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) { > > + if (exbuf->ring_idx >= vfpriv->num_rings) > > + return -EINVAL; > > + > > + if (!vfpriv->base_fence_ctx) > > + return -EINVAL; > > + > > + fence_ctx = vfpriv->base_fence_ctx; > > + ring_idx = exbuf->ring_idx; > > + } > > + > > exbuf->fence_fd = -1; > > > > virtio_gpu_create_context(dev, file); > > @@ -173,7 +189,7 @@ static int
virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> > goto out_memdup; > > } > > > > - out_fence = virtio_gpu_fence_alloc(vgdev,
vgdev->fence_drv.context, 0);
> > + out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx,
ring_idx);
> > if(!out_fence) { > > ret = -ENOMEM; > > goto out_unresv; > > @@ -691,7 +707,7 @@ static int
virtio_gpu_context_init_ioctl(struct drm_device *dev,
> > return -EINVAL; > > > > /* Number of unique parameters supported at this time.
*/
> > - if (num_params > 1) > > + if (num_params > 2) > > return -EINVAL; > > > > ctx_set_params =
memdup_user(u64_to_user_ptr(args->ctx_set_params),
> > @@ -731,6 +747,20 @@ static int
virtio_gpu_context_init_ioctl(struct drm_device *dev,
> > > > vfpriv->context_init |= value; > > break; > > + case VIRTGPU_CONTEXT_PARAM_NUM_RINGS: > > + if (vfpriv->base_fence_ctx) { > > + ret = -EINVAL; > > + goto out_unlock; > > + } > > + > > + if (value > MAX_RINGS) { > > + ret = -EINVAL; > > + goto out_unlock; > > + } > > + > > + vfpriv->base_fence_ctx =
dma_fence_context_alloc(value);
> With multiple fence contexts, we should do something about
implicit fencing.
> > The classic example is Mesa and X server. When both use virgl
and the
> global fence context, no dma_fence_wait is fine. But when Mesa
uses
> venus and the ring fence context, dma_fence_wait should be
inserted.
If I read your comment correctly, the use case is:
context A (venus)
sharing a render target with
context B (Xserver backed virgl)
?
Which function do you envisage dma_fence_wait(...) to be
inserted? Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
Fences can be implicitly shared via reservation objects associated with buffer objects.
It may be possible to wait on the reservation object associated
with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
Right, that's what I am looking for. Userspace expects implicit fencing to work. While there are works to move the userspace to do explicit fencing, it is not there yet in general and we can't require the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
Another option would be to use the upcoming
DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context).
That requires the X server / compositors to be modified. For example, venus works under Android (where there is explicit fencing) or under a modified compositor (which does DMA_BUF_IOCTL_EXPORT_SYNC_FILE or DRM_IOCTL_VIRTGPU_WAIT). But it does not work too well with an unmodified X server.
Some semi-recent virgl modifications will be needed regardless for
interop, such as VIRGL_CAP_V2_UNTYPED_RESOURCE (?).
Not sure aren't too many virgl users (most developers)
Does Xserver just pick up the latest Mesa release (including
virgl/venus)? Suppose context types land in 5.16, the userspace changes land (both Venus/Virgl) in 21.2 stable releases.
https://docs.mesa3d.org/release-calendar.html
Generally, if it only requires virgl changes, userspace changes are
fine since OpenGL drivers implement implicit sync in many ways. Waiting on the reservation object in the kernel is fine too though.
I don't think we want to assume virgl to be the only consumer of dma-bufs, despite that it is the most common use case.
Though venus doesn't use the NUM_RINGS param yet. Getting all
permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
WDYT of someone figuring out virgl/venus interop later, independently
of this patchset?
I think we should understand the implications of multiple fence contexts better, even if some changes are not included in this patchset.
From my view, we don't need implicit fencing in most cases and implicit fencing should be considered a legacy path. But X server / compositors today happen to require it. Other drivers seem to use a flag to control whether implicit fences are set up or waited (e.g., AMDGPU_GEM_CREATE_EXPLICIT_SYNC, MSM_SUBMIT_NO_IMPLICIT, or EXEC_OBJECT_WRITE). It seems to be the least surprising thing to do.
IMO, the easiest way is just to limit the change to userspace if
possible since implicit sync is legacy/something we want to deprecate over time.
Another option is to add something like VIRTGPU_EXECBUF_EXPLICIT_SYNC
(similar to MSM_SUBMIT_NO_IMPLICIT), where the reservation objects are waited on / added to without that flag. Since explicit sync will need new hypercalls/params and is a major, that feature is expected to be independent of context types.
With that option, waiting on the reservation object would just be
another bug fix + addition to 5.16 (perhaps by you) so we can proceed in parallel faster. VIRTGPU_EXECBUF_EXPLICIT_SYNC (or an equivalent) would be added later.
It is fine to add it later, but VIRTGPU_EXECBUF_EXPLICIT_SYNC sounds better to me than a userspace solution. I don't think we need a new hypercall as the wait can be a guest-side wait, similar to how VIRTGPU_EXECBUF_FENCE_FD_IN is a guest-side wait. The flag can also suppress VIRTIO_GPU_FLAG_FENCE and make the SUBMIT_3D hypercall cheaper.
Okay, I will add a note regarding the plan in patch 9 of v2 for context types that need implicit sync.
Even before this patchset, unless I miss something, it seems the fact that we have a global fence context and assume all host GL contexts are on the same timeline is not exactly correct. When glFlush is called on two host GL contexts, the flush order is not exactly the same as the execution order. But that is a different issue that can be solved in virglrenderer.
> > > > + vfpriv->num_rings = value; > > + break; > > default: > > ret = -EINVAL; > > goto out_unlock; > > -- > > 2.33.0.153.gba50c8fa24-goog > > > > > >
> > To unsubscribe, e-mail:
virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail:
virtio-dev-help@lists.oasis-open.org
> >
For the Sommelier guest Wayland proxy, it's desirable for the DRM fd to be pollable in response to an host compositor event. This can also be used by the 3D driver to poll events on a CPU timeline.
This enables the DRM fd associated with a particular 3D context to be polled independent of KMS events. The parameter VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK specifies the pollable rings.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Nicholas Verne nverne@chromium.org --- drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_ioctl.c | 22 +++++++++++++++++++++- 2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index cca9ab505deb..cb60d52c2bd1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -266,6 +266,7 @@ struct virtio_gpu_fpriv { bool context_created; uint32_t num_rings; uint64_t base_fence_ctx; + uint64_t ring_idx_mask; struct mutex context_lock; };
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index 262f79210283..be7b22a03884 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -694,6 +694,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, { int ret = 0; uint32_t num_params, i, param, value; + uint64_t valid_ring_mask; size_t len; struct drm_virtgpu_context_set_param *ctx_set_params = NULL; struct virtio_gpu_device *vgdev = dev->dev_private; @@ -707,7 +708,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL;
/* Number of unique parameters supported at this time. */ - if (num_params > 2) + if (num_params > 3) return -EINVAL;
ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), @@ -761,12 +762,31 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, vfpriv->base_fence_ctx = dma_fence_context_alloc(value); vfpriv->num_rings = value; break; + case VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK: + if (vfpriv->ring_idx_mask) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->ring_idx_mask = value; + break; default: ret = -EINVAL; goto out_unlock; } }
+ if (vfpriv->ring_idx_mask) { + valid_ring_mask = 0; + for (i = 0; i < vfpriv->num_rings; i++) + valid_ring_mask |= 1 << i; + + if (~valid_ring_mask & vfpriv->ring_idx_mask) { + ret = -EINVAL; + goto out_unlock; + } + } + virtio_gpu_create_context_locked(vgdev, vfpriv); virtio_gpu_notify(vgdev);
Similar to DRM_VMW_EVENT_FENCE_SIGNALED. Sends a pollable event to the DRM file descriptor when a fence on a specific ring is signaled.
One difference is the event is not exposed via the UAPI -- this is because host responses are on a shared memory buffer of type BLOB_MEM_GUEST [this is the common way to receive responses with virtgpu]. As such, there is no context specific read(..) implementation either -- just a poll(..) implementation.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Nicholas Verne nverne@chromium.org --- drivers/gpu/drm/virtio/virtgpu_drv.c | 43 +++++++++++++++++++++++++- drivers/gpu/drm/virtio/virtgpu_drv.h | 7 +++++ drivers/gpu/drm/virtio/virtgpu_fence.c | 10 ++++++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++ 4 files changed, 93 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index 9d963f1fda8f..749db18dcfa2 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -29,6 +29,8 @@ #include <linux/module.h> #include <linux/console.h> #include <linux/pci.h> +#include <linux/poll.h> +#include <linux/wait.h>
#include <drm/drm.h> #include <drm/drm_aperture.h> @@ -155,6 +157,35 @@ static void virtio_gpu_config_changed(struct virtio_device *vdev) schedule_work(&vgdev->config_changed_work); }
+static __poll_t virtio_gpu_poll(struct file *filp, + struct poll_table_struct *wait) +{ + struct drm_file *drm_file = filp->private_data; + struct virtio_gpu_fpriv *vfpriv = drm_file->driver_priv; + struct drm_device *dev = drm_file->minor->dev; + struct drm_pending_event *e = NULL; + __poll_t mask = 0; + + if (!vfpriv->ring_idx_mask) + return drm_poll(filp, wait); + + poll_wait(filp, &drm_file->event_wait, wait); + + if (!list_empty(&drm_file->event_list)) { + spin_lock_irq(&dev->event_lock); + e = list_first_entry(&drm_file->event_list, + struct drm_pending_event, link); + drm_file->event_space += e->event->length; + list_del(&e->link); + spin_unlock_irq(&dev->event_lock); + + kfree(e); + mask |= EPOLLIN | EPOLLRDNORM; + } + + return mask; +} + static struct virtio_device_id id_table[] = { { VIRTIO_ID_GPU, VIRTIO_DEV_ANY_ID }, { 0 }, @@ -194,7 +225,17 @@ MODULE_AUTHOR("Dave Airlie airlied@redhat.com"); MODULE_AUTHOR("Gerd Hoffmann kraxel@redhat.com"); MODULE_AUTHOR("Alon Levy");
-DEFINE_DRM_GEM_FOPS(virtio_gpu_driver_fops); +static const struct file_operations virtio_gpu_driver_fops = { + .owner = THIS_MODULE, + .open = drm_open, + .release = drm_release, + .unlocked_ioctl = drm_ioctl, + .compat_ioctl = drm_compat_ioctl, + .poll = virtio_gpu_poll, + .read = drm_read, + .llseek = noop_llseek, + .mmap = drm_gem_mmap +};
static const struct drm_driver driver = { .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index cb60d52c2bd1..e0265fe74aa5 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -138,11 +138,18 @@ struct virtio_gpu_fence_driver { spinlock_t lock; };
+#define VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL 0x10000000 +struct virtio_gpu_fence_event { + struct drm_pending_event base; + struct drm_event event; +}; + struct virtio_gpu_fence { struct dma_fence f; uint32_t ring_idx; uint64_t fence_id; bool emit_fence_info; + struct virtio_gpu_fence_event *e; struct virtio_gpu_fence_driver *drv; struct list_head node; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index 98a00c1e654d..f28357dbde35 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -152,11 +152,21 @@ void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, continue;
dma_fence_signal_locked(&curr->f); + if (curr->e) { + drm_send_event(vgdev->ddev, &curr->e->base); + curr->e = NULL; + } + list_del(&curr->node); dma_fence_put(&curr->f); }
dma_fence_signal_locked(&signaled->f); + if (signaled->e) { + drm_send_event(vgdev->ddev, &signaled->e->base); + signaled->e = NULL; + } + list_del(&signaled->node); dma_fence_put(&signaled->f); break; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index be7b22a03884..fdaa7f3d9eeb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -38,6 +38,36 @@ VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \ VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE)
+static int virtio_gpu_fence_event_create(struct drm_device *dev, + struct drm_file *file, + struct virtio_gpu_fence *fence, + uint32_t ring_idx) +{ + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + struct virtio_gpu_fence_event *e = NULL; + int ret; + + if (!(vfpriv->ring_idx_mask & (1 << ring_idx))) + return 0; + + e = kzalloc(sizeof(*e), GFP_KERNEL); + if (!e) + return -ENOMEM; + + e->event.type = VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL; + e->event.length = sizeof(e->event); + + ret = drm_event_reserve_init(dev, file, &e->base, &e->event); + if (ret) + goto free; + + fence->e = e; + return 0; +free: + kfree(e); + return ret; +} + /* Must be called with &virtio_gpu_fpriv.struct_mutex held. */ static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev, struct virtio_gpu_fpriv *vfpriv) @@ -195,6 +225,10 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_unresv; }
+ ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx); + if (ret) + goto out_unresv; + if (out_fence_fd >= 0) { sync_file = sync_file_create(&out_fence->f); if (!sync_file) {
This advertises the context init feature to userspace, along with a mask of supported capabilities.
Signed-off-by: Gurchetan Singh gurchetansingh@chromium.org Acked-by: Lingfeng Yang lfy@google.com --- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index fdaa7f3d9eeb..5618a1d5879c 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -286,6 +286,12 @@ static int virtio_gpu_getparam_ioctl(struct drm_device *dev, void *data, case VIRTGPU_PARAM_CROSS_DEVICE: value = vgdev->has_resource_assign_uuid ? 1 : 0; break; + case VIRTGPU_PARAM_CONTEXT_INIT: + value = vgdev->has_context_init ? 1 : 0; + break; + case VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs: + value = vgdev->capset_id_mask; + break; default: return -EINVAL; }
dri-devel@lists.freedesktop.org