Prior series have transformed other parts of VFIO from working on struct device or struct vfio_group into working directly on struct vfio_device. Based on that work we now have vfio_device's readily available in all the drivers.
Update the rest of the driver facing API to use vfio_device as an input.
The following are switched from struct device to struct vfio_device: vfio_register_notifier() vfio_unregister_notifier() vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw()
The following group APIs are obsoleted and removed by just using struct vfio_device with the above: vfio_group_pin_pages() vfio_group_unpin_pages() vfio_group_iommu_domain() vfio_group_get_external_user_from_dev()
To retain the performance of the new device APIs relative to their group versions optimize how vfio_group_add_container_user() is used to avoid calling it when the driver must already guarantee the device is open and the container_users incrd.
The remaining exported VFIO group interfaces are only used by kvm, and are addressed by a parallel series.
There is a conflict with Christoph's gvt rework here:
https://lore.kernel.org/all/20220411141403.86980-1-hch@lst.de/
I've organized this so it is independent of Christoph's series, by adding the temporary mdev_legacy_get_vfio_device(), however it is easy for me to rebase. We can decide what to do as we see what becomes mergable. My preference would be to see Christoph's series merged into the drm&vfio trees and we do both series this cycle.
I have a followup series that needs this.
This is also part of the iommufd work - moving the driver facing interface to vfio_device provides a much cleaner path to integrate with iommufd.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
Jason Gunthorpe (9): vfio: Make vfio_(un)register_notifier accept a vfio_device vfio/ccw: Remove mdev from struct channel_program vfio/mdev: Pass in a struct vfio_device * to vfio_pin/unpin_pages() drm/i915/gvt: Change from vfio_group_(un)pin_pages to vfio_(un)pin_pages vfio: Pass in a struct vfio_device * to vfio_dma_rw() drm/i915/gvt: Add missing module_put() in error unwind drm/i915/gvt: Delete kvmgt_vdev::vfio_group vfio: Remove dead code vfio: Remove calls to vfio_group_add_container_user()
.../driver-api/vfio-mediated-device.rst | 4 +- drivers/gpu/drm/i915/gvt/kvmgt.c | 48 ++- drivers/s390/cio/vfio_ccw_cp.c | 44 +-- drivers/s390/cio/vfio_ccw_cp.h | 4 +- drivers/s390/cio/vfio_ccw_fsm.c | 3 +- drivers/s390/cio/vfio_ccw_ops.c | 7 +- drivers/s390/crypto/vfio_ap_ops.c | 22 +- drivers/vfio/mdev/vfio_mdev.c | 12 + drivers/vfio/vfio.c | 283 ++---------------- include/linux/mdev.h | 1 + include/linux/vfio.h | 21 +- 11 files changed, 115 insertions(+), 334 deletions(-)
base-commit: ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
All callers have a struct vfio_device trivially available, pass it in directly and avoid calling the expensive vfio_group_get_from_dev().
To support the unconverted kvmgt mdev driver add mdev_legacy_get_vfio_device() which will return the vfio_device pointer vfio_mdev.c puts in the drv_data.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/gpu/drm/i915/gvt/kvmgt.c | 15 +++++++++------ drivers/s390/cio/vfio_ccw_ops.c | 7 +++---- drivers/s390/crypto/vfio_ap_ops.c | 14 +++++++------- drivers/vfio/mdev/vfio_mdev.c | 12 ++++++++++++ drivers/vfio/vfio.c | 25 +++++++------------------ include/linux/mdev.h | 1 + include/linux/vfio.h | 4 ++-- 7 files changed, 41 insertions(+), 37 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 057ec449010458..bb59d21cf898ab 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -904,6 +904,7 @@ static int intel_vgpu_group_notifier(struct notifier_block *nb,
static int intel_vgpu_open_device(struct mdev_device *mdev) { + struct vfio_device *vfio_dev = mdev_legacy_get_vfio_device(mdev); struct intel_vgpu *vgpu = mdev_get_drvdata(mdev); struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); unsigned long events; @@ -914,7 +915,7 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) vdev->group_notifier.notifier_call = intel_vgpu_group_notifier;
events = VFIO_IOMMU_NOTIFY_DMA_UNMAP; - ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY, &events, + ret = vfio_register_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &events, &vdev->iommu_notifier); if (ret != 0) { gvt_vgpu_err("vfio_register_notifier for iommu failed: %d\n", @@ -923,7 +924,7 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) }
events = VFIO_GROUP_NOTIFY_SET_KVM; - ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY, &events, + ret = vfio_register_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &events, &vdev->group_notifier); if (ret != 0) { gvt_vgpu_err("vfio_register_notifier for group failed: %d\n", @@ -961,11 +962,11 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) vdev->vfio_group = NULL;
undo_register: - vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY, + vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier);
undo_iommu: - vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY, + vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &vdev->iommu_notifier); out: return ret; @@ -988,6 +989,7 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu) struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); struct drm_i915_private *i915 = vgpu->gvt->gt->i915; struct kvmgt_guest_info *info; + struct vfio_device *vfio_dev; int ret;
if (!handle_valid(vgpu->handle)) @@ -998,12 +1000,13 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
intel_gvt_ops->vgpu_release(vgpu);
- ret = vfio_unregister_notifier(mdev_dev(vdev->mdev), VFIO_IOMMU_NOTIFY, + vfio_dev = mdev_legacy_get_vfio_device(vdev->mdev); + ret = vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &vdev->iommu_notifier); drm_WARN(&i915->drm, ret, "vfio_unregister_notifier for iommu failed: %d\n", ret);
- ret = vfio_unregister_notifier(mdev_dev(vdev->mdev), VFIO_GROUP_NOTIFY, + ret = vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier); drm_WARN(&i915->drm, ret, "vfio_unregister_notifier for group failed: %d\n", ret); diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index d8589afac272f1..e1ce24d8fb2555 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -183,7 +183,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
private->nb.notifier_call = vfio_ccw_mdev_notifier;
- ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, + ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events, &private->nb); if (ret) return ret; @@ -204,8 +204,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
out_unregister: vfio_ccw_unregister_dev_regions(private); - vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, - &private->nb); + vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private->nb); return ret; }
@@ -223,7 +222,7 @@ static void vfio_ccw_mdev_close_device(struct vfio_device *vdev)
cp_free(&private->cp); vfio_ccw_unregister_dev_regions(private); - vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb); + vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private->nb); }
static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private, diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 6e08d04b605d6e..69768061cd7bd9 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1406,21 +1406,21 @@ static int vfio_ap_mdev_open_device(struct vfio_device *vdev) matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier; events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(vdev->dev, VFIO_GROUP_NOTIFY, - &events, &matrix_mdev->group_notifier); + ret = vfio_register_notifier(vdev, VFIO_GROUP_NOTIFY, &events, + &matrix_mdev->group_notifier); if (ret) return ret;
matrix_mdev->iommu_notifier.notifier_call = vfio_ap_mdev_iommu_notifier; events = VFIO_IOMMU_NOTIFY_DMA_UNMAP; - ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, - &events, &matrix_mdev->iommu_notifier); + ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events, + &matrix_mdev->iommu_notifier); if (ret) goto out_unregister_group; return 0;
out_unregister_group: - vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY, + vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); return ret; } @@ -1430,9 +1430,9 @@ static void vfio_ap_mdev_close_device(struct vfio_device *vdev) struct ap_matrix_mdev *matrix_mdev = container_of(vdev, struct ap_matrix_mdev, vdev);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, + vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &matrix_mdev->iommu_notifier); - vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY, + vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); vfio_ap_mdev_unset_kvm(matrix_mdev); } diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c index a90e24b0c851d3..91605c1e8c8f94 100644 --- a/drivers/vfio/mdev/vfio_mdev.c +++ b/drivers/vfio/mdev/vfio_mdev.c @@ -17,6 +17,18 @@
#include "mdev_private.h"
+/* + * Return the struct vfio_device for the mdev when using the legacy + * vfio_mdev_dev_ops path. No new callers to this function should be added. + */ +struct vfio_device *mdev_legacy_get_vfio_device(struct mdev_device *mdev) +{ + if (WARN_ON(mdev->dev.driver != &vfio_mdev_driver.driver)) + return NULL; + return dev_get_drvdata(&mdev->dev); +} +EXPORT_SYMBOL_GPL(mdev_legacy_get_vfio_device); + static int vfio_mdev_open_device(struct vfio_device *core_vdev) { struct mdev_device *mdev = to_mdev_device(core_vdev->dev); diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index a4555014bd1e72..8a5c46aa2bef61 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2484,19 +2484,15 @@ static int vfio_unregister_group_notifier(struct vfio_group *group, return ret; }
-int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *events, struct notifier_block *nb) { - struct vfio_group *group; + struct vfio_group *group = dev->group; int ret;
- if (!dev || !nb || !events || (*events == 0)) + if (!nb || !events || (*events == 0)) return -EINVAL;
- group = vfio_group_get_from_dev(dev); - if (!group) - return -ENODEV; - switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_register_iommu_notifier(group, events, nb); @@ -2507,25 +2503,20 @@ int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, default: ret = -EINVAL; } - - vfio_group_put(group); return ret; } EXPORT_SYMBOL(vfio_register_notifier);
-int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_unregister_notifier(struct vfio_device *dev, + enum vfio_notify_type type, struct notifier_block *nb) { - struct vfio_group *group; + struct vfio_group *group = dev->group; int ret;
- if (!dev || !nb) + if (!nb) return -EINVAL;
- group = vfio_group_get_from_dev(dev); - if (!group) - return -ENODEV; - switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_unregister_iommu_notifier(group, nb); @@ -2536,8 +2527,6 @@ int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type, default: ret = -EINVAL; } - - vfio_group_put(group); return ret; } EXPORT_SYMBOL(vfio_unregister_notifier); diff --git a/include/linux/mdev.h b/include/linux/mdev.h index 15d03f6532d073..67d07220a28f29 100644 --- a/include/linux/mdev.h +++ b/include/linux/mdev.h @@ -29,6 +29,7 @@ static inline struct mdev_device *to_mdev_device(struct device *dev) unsigned int mdev_get_type_group_id(struct mdev_device *mdev); unsigned int mtype_get_type_group_id(struct mdev_type *mtype); struct device *mtype_get_parent_dev(struct mdev_type *mtype); +struct vfio_device *mdev_legacy_get_vfio_device(struct mdev_device *mdev);
/** * struct mdev_parent_ops - Structure to be registered for each parent device to diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 66dda06ec42d1b..748ec0e0293aea 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -178,11 +178,11 @@ enum vfio_notify_type { /* events for VFIO_GROUP_NOTIFY */ #define VFIO_GROUP_NOTIFY_SET_KVM BIT(0)
-extern int vfio_register_notifier(struct device *dev, +extern int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *required_events, struct notifier_block *nb); -extern int vfio_unregister_notifier(struct device *dev, +extern int vfio_unregister_notifier(struct vfio_device *dev, enum vfio_notify_type type, struct notifier_block *nb);
On Tue, 2022-04-12 at 12:53 -0300, Jason Gunthorpe wrote:
All callers have a struct vfio_device trivially available, pass it in directly and avoid calling the expensive vfio_group_get_from_dev().
To support the unconverted kvmgt mdev driver add mdev_legacy_get_vfio_device() which will return the vfio_device pointer vfio_mdev.c puts in the drv_data.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/gpu/drm/i915/gvt/kvmgt.c | 15 +++++++++------ drivers/s390/cio/vfio_ccw_ops.c | 7 +++---- drivers/s390/crypto/vfio_ap_ops.c | 14 +++++++------- drivers/vfio/mdev/vfio_mdev.c | 12 ++++++++++++ drivers/vfio/vfio.c | 25 +++++++------------------ include/linux/mdev.h | 1 + include/linux/vfio.h | 4 ++-- 7 files changed, 41 insertions(+), 37 deletions(-)
For the -ccw bits:
Acked-by: Eric Farman farman@linux.ibm.com
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 057ec449010458..bb59d21cf898ab 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -904,6 +904,7 @@ static int intel_vgpu_group_notifier(struct notifier_block *nb,
static int intel_vgpu_open_device(struct mdev_device *mdev) {
- struct vfio_device *vfio_dev =
mdev_legacy_get_vfio_device(mdev); struct intel_vgpu *vgpu = mdev_get_drvdata(mdev); struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); unsigned long events; @@ -914,7 +915,7 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) vdev->group_notifier.notifier_call = intel_vgpu_group_notifier;
events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
- ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
&events,
- ret = vfio_register_notifier(vfio_dev, VFIO_IOMMU_NOTIFY,
&events, &vdev->iommu_notifier); if (ret != 0) { gvt_vgpu_err("vfio_register_notifier for iommu failed: %d\n", @@ -923,7 +924,7 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) }
events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
&events,
- ret = vfio_register_notifier(vfio_dev, VFIO_GROUP_NOTIFY,
&events, &vdev->group_notifier); if (ret != 0) { gvt_vgpu_err("vfio_register_notifier for group failed: %d\n", @@ -961,11 +962,11 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) vdev->vfio_group = NULL;
undo_register:
- vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier);
undo_iommu:
- vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
- vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &vdev->iommu_notifier);
out: return ret; @@ -988,6 +989,7 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu) struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); struct drm_i915_private *i915 = vgpu->gvt->gt->i915; struct kvmgt_guest_info *info;
struct vfio_device *vfio_dev; int ret;
if (!handle_valid(vgpu->handle))
@@ -998,12 +1000,13 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
intel_gvt_ops->vgpu_release(vgpu);
- ret = vfio_unregister_notifier(mdev_dev(vdev->mdev),
VFIO_IOMMU_NOTIFY,
- vfio_dev = mdev_legacy_get_vfio_device(vdev->mdev);
- ret = vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &vdev->iommu_notifier); drm_WARN(&i915->drm, ret, "vfio_unregister_notifier for iommu failed: %d\n",
ret);
- ret = vfio_unregister_notifier(mdev_dev(vdev->mdev),
VFIO_GROUP_NOTIFY,
- ret = vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier); drm_WARN(&i915->drm, ret, "vfio_unregister_notifier for group failed: %d\n",
ret); diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index d8589afac272f1..e1ce24d8fb2555 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -183,7 +183,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
private->nb.notifier_call = vfio_ccw_mdev_notifier;
- ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
- ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events, &private->nb); if (ret) return ret;
@@ -204,8 +204,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
out_unregister: vfio_ccw_unregister_dev_regions(private);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
&private->nb);
- vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private-
nb);
return ret; }
@@ -223,7 +222,7 @@ static void vfio_ccw_mdev_close_device(struct vfio_device *vdev)
cp_free(&private->cp); vfio_ccw_unregister_dev_regions(private);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
&private->nb);
- vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private-
nb);
}
static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private, diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 6e08d04b605d6e..69768061cd7bd9 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1406,21 +1406,21 @@ static int vfio_ap_mdev_open_device(struct vfio_device *vdev) matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier; events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
&events, &matrix_mdev-
group_notifier);
ret = vfio_register_notifier(vdev, VFIO_GROUP_NOTIFY, &events,
&matrix_mdev->group_notifier);
if (ret) return ret;
matrix_mdev->iommu_notifier.notifier_call =
vfio_ap_mdev_iommu_notifier; events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
- ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
&events, &matrix_mdev-
iommu_notifier);
- ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events,
if (ret) goto out_unregister_group; return 0;&matrix_mdev->iommu_notifier);
out_unregister_group:
- vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); return ret;
} @@ -1430,9 +1430,9 @@ static void vfio_ap_mdev_close_device(struct vfio_device *vdev) struct ap_matrix_mdev *matrix_mdev = container_of(vdev, struct ap_matrix_mdev, vdev);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &matrix_mdev->iommu_notifier);
- vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); vfio_ap_mdev_unset_kvm(matrix_mdev);
} diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c index a90e24b0c851d3..91605c1e8c8f94 100644 --- a/drivers/vfio/mdev/vfio_mdev.c +++ b/drivers/vfio/mdev/vfio_mdev.c @@ -17,6 +17,18 @@
#include "mdev_private.h"
+/*
- Return the struct vfio_device for the mdev when using the legacy
- vfio_mdev_dev_ops path. No new callers to this function should be
added.
- */
+struct vfio_device *mdev_legacy_get_vfio_device(struct mdev_device *mdev) +{
- if (WARN_ON(mdev->dev.driver != &vfio_mdev_driver.driver))
return NULL;
- return dev_get_drvdata(&mdev->dev);
+} +EXPORT_SYMBOL_GPL(mdev_legacy_get_vfio_device);
static int vfio_mdev_open_device(struct vfio_device *core_vdev) { struct mdev_device *mdev = to_mdev_device(core_vdev->dev); diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index a4555014bd1e72..8a5c46aa2bef61 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2484,19 +2484,15 @@ static int vfio_unregister_group_notifier(struct vfio_group *group, return ret; }
-int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *events, struct notifier_block *nb) {
- struct vfio_group *group;
- struct vfio_group *group = dev->group; int ret;
- if (!dev || !nb || !events || (*events == 0))
- if (!nb || !events || (*events == 0)) return -EINVAL;
- group = vfio_group_get_from_dev(dev);
- if (!group)
return -ENODEV;
- switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_register_iommu_notifier(group, events, nb);
@@ -2507,25 +2503,20 @@ int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, default: ret = -EINVAL; }
- vfio_group_put(group); return ret;
} EXPORT_SYMBOL(vfio_register_notifier);
-int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_unregister_notifier(struct vfio_device *dev,
enum vfio_notify_type type, struct notifier_block *nb)
{
- struct vfio_group *group;
- struct vfio_group *group = dev->group; int ret;
- if (!dev || !nb)
- if (!nb) return -EINVAL;
- group = vfio_group_get_from_dev(dev);
- if (!group)
return -ENODEV;
- switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_unregister_iommu_notifier(group, nb);
@@ -2536,8 +2527,6 @@ int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type, default: ret = -EINVAL; }
- vfio_group_put(group); return ret;
} EXPORT_SYMBOL(vfio_unregister_notifier); diff --git a/include/linux/mdev.h b/include/linux/mdev.h index 15d03f6532d073..67d07220a28f29 100644 --- a/include/linux/mdev.h +++ b/include/linux/mdev.h @@ -29,6 +29,7 @@ static inline struct mdev_device *to_mdev_device(struct device *dev) unsigned int mdev_get_type_group_id(struct mdev_device *mdev); unsigned int mtype_get_type_group_id(struct mdev_type *mtype); struct device *mtype_get_parent_dev(struct mdev_type *mtype); +struct vfio_device *mdev_legacy_get_vfio_device(struct mdev_device *mdev);
/**
- struct mdev_parent_ops - Structure to be registered for each
parent device to diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 66dda06ec42d1b..748ec0e0293aea 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -178,11 +178,11 @@ enum vfio_notify_type { /* events for VFIO_GROUP_NOTIFY */ #define VFIO_GROUP_NOTIFY_SET_KVM BIT(0)
-extern int vfio_register_notifier(struct device *dev, +extern int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *required_events, struct notifier_block *nb); -extern int vfio_unregister_notifier(struct device *dev, +extern int vfio_unregister_notifier(struct vfio_device *dev, enum vfio_notify_type type, struct notifier_block *nb);
On 4/12/22 11:53 AM, Jason Gunthorpe wrote:
All callers have a struct vfio_device trivially available, pass it in directly and avoid calling the expensive vfio_group_get_from_dev().
To support the unconverted kvmgt mdev driver add mdev_legacy_get_vfio_device() which will return the vfio_device pointer vfio_mdev.c puts in the drv_data.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/gpu/drm/i915/gvt/kvmgt.c | 15 +++++++++------ drivers/s390/cio/vfio_ccw_ops.c | 7 +++---- drivers/s390/crypto/vfio_ap_ops.c | 14 +++++++------- drivers/vfio/mdev/vfio_mdev.c | 12 ++++++++++++ drivers/vfio/vfio.c | 25 +++++++------------------ include/linux/mdev.h | 1 + include/linux/vfio.h | 4 ++-- 7 files changed, 41 insertions(+), 37 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 057ec449010458..bb59d21cf898ab 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -904,6 +904,7 @@ static int intel_vgpu_group_notifier(struct notifier_block *nb,
static int intel_vgpu_open_device(struct mdev_device *mdev) {
- struct vfio_device *vfio_dev = mdev_legacy_get_vfio_device(mdev); struct intel_vgpu *vgpu = mdev_get_drvdata(mdev); struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); unsigned long events;
@@ -914,7 +915,7 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) vdev->group_notifier.notifier_call = intel_vgpu_group_notifier;
events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
- ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY, &events,
- ret = vfio_register_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &events, &vdev->iommu_notifier); if (ret != 0) { gvt_vgpu_err("vfio_register_notifier for iommu failed: %d\n",
@@ -923,7 +924,7 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) }
events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY, &events,
- ret = vfio_register_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &events, &vdev->group_notifier); if (ret != 0) { gvt_vgpu_err("vfio_register_notifier for group failed: %d\n",
@@ -961,11 +962,11 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) vdev->vfio_group = NULL;
undo_register:
- vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier);
undo_iommu:
- vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
- vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &vdev->iommu_notifier); out: return ret;
@@ -988,6 +989,7 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu) struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); struct drm_i915_private *i915 = vgpu->gvt->gt->i915; struct kvmgt_guest_info *info;
struct vfio_device *vfio_dev; int ret;
if (!handle_valid(vgpu->handle))
@@ -998,12 +1000,13 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
intel_gvt_ops->vgpu_release(vgpu);
- ret = vfio_unregister_notifier(mdev_dev(vdev->mdev), VFIO_IOMMU_NOTIFY,
- vfio_dev = mdev_legacy_get_vfio_device(vdev->mdev);
- ret = vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &vdev->iommu_notifier); drm_WARN(&i915->drm, ret, "vfio_unregister_notifier for iommu failed: %d\n", ret);
- ret = vfio_unregister_notifier(mdev_dev(vdev->mdev), VFIO_GROUP_NOTIFY,
- ret = vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier); drm_WARN(&i915->drm, ret, "vfio_unregister_notifier for group failed: %d\n", ret);
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index d8589afac272f1..e1ce24d8fb2555 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -183,7 +183,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
private->nb.notifier_call = vfio_ccw_mdev_notifier;
- ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
- ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events, &private->nb); if (ret) return ret;
@@ -204,8 +204,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
out_unregister: vfio_ccw_unregister_dev_regions(private);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
&private->nb);
- vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private->nb); return ret; }
@@ -223,7 +222,7 @@ static void vfio_ccw_mdev_close_device(struct vfio_device *vdev)
cp_free(&private->cp); vfio_ccw_unregister_dev_regions(private);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb);
vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private->nb); }
static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 6e08d04b605d6e..69768061cd7bd9 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1406,21 +1406,21 @@ static int vfio_ap_mdev_open_device(struct vfio_device *vdev) matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier; events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
&events, &matrix_mdev->group_notifier);
ret = vfio_register_notifier(vdev, VFIO_GROUP_NOTIFY, &events,
&matrix_mdev->group_notifier);
if (ret) return ret;
matrix_mdev->iommu_notifier.notifier_call = vfio_ap_mdev_iommu_notifier; events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
- ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
&events, &matrix_mdev->iommu_notifier);
ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events,
&matrix_mdev->iommu_notifier);
if (ret) goto out_unregister_group; return 0;
out_unregister_group:
- vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); return ret; }
@@ -1430,9 +1430,9 @@ static void vfio_ap_mdev_close_device(struct vfio_device *vdev) struct ap_matrix_mdev *matrix_mdev = container_of(vdev, struct ap_matrix_mdev, vdev);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &matrix_mdev->iommu_notifier);
- vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); vfio_ap_mdev_unset_kvm(matrix_mdev); }
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c index a90e24b0c851d3..91605c1e8c8f94 100644 --- a/drivers/vfio/mdev/vfio_mdev.c +++ b/drivers/vfio/mdev/vfio_mdev.c @@ -17,6 +17,18 @@
#include "mdev_private.h"
+/*
- Return the struct vfio_device for the mdev when using the legacy
- vfio_mdev_dev_ops path. No new callers to this function should be added.
- */
+struct vfio_device *mdev_legacy_get_vfio_device(struct mdev_device *mdev) +{
- if (WARN_ON(mdev->dev.driver != &vfio_mdev_driver.driver))
return NULL;
- return dev_get_drvdata(&mdev->dev);
+} +EXPORT_SYMBOL_GPL(mdev_legacy_get_vfio_device);
- static int vfio_mdev_open_device(struct vfio_device *core_vdev) { struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index a4555014bd1e72..8a5c46aa2bef61 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2484,19 +2484,15 @@ static int vfio_unregister_group_notifier(struct vfio_group *group, return ret; }
-int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *events, struct notifier_block *nb) {
- struct vfio_group *group;
- struct vfio_group *group = dev->group;
Is there a guarantee that dev != NULL? The original code below checks the value of dev, so why is that check eliminated here?
int ret;
- if (!dev || !nb || !events || (*events == 0))
- if (!nb || !events || (*events == 0)) return -EINVAL;
- group = vfio_group_get_from_dev(dev);
- if (!group)
return -ENODEV;
- switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_register_iommu_notifier(group, events, nb);
@@ -2507,25 +2503,20 @@ int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, default: ret = -EINVAL; }
- vfio_group_put(group); return ret; } EXPORT_SYMBOL(vfio_register_notifier);
-int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_unregister_notifier(struct vfio_device *dev,
{enum vfio_notify_type type, struct notifier_block *nb)
- struct vfio_group *group;
- struct vfio_group *group = dev->group;
Same comment as above, not NULL check here.
int ret;
- if (!dev || !nb)
- if (!nb) return -EINVAL;
- group = vfio_group_get_from_dev(dev);
- if (!group)
return -ENODEV;
- switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_unregister_iommu_notifier(group, nb);
@@ -2536,8 +2527,6 @@ int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type, default: ret = -EINVAL; }
- vfio_group_put(group); return ret; } EXPORT_SYMBOL(vfio_unregister_notifier);
diff --git a/include/linux/mdev.h b/include/linux/mdev.h index 15d03f6532d073..67d07220a28f29 100644 --- a/include/linux/mdev.h +++ b/include/linux/mdev.h @@ -29,6 +29,7 @@ static inline struct mdev_device *to_mdev_device(struct device *dev) unsigned int mdev_get_type_group_id(struct mdev_device *mdev); unsigned int mtype_get_type_group_id(struct mdev_type *mtype); struct device *mtype_get_parent_dev(struct mdev_type *mtype); +struct vfio_device *mdev_legacy_get_vfio_device(struct mdev_device *mdev);
/**
- struct mdev_parent_ops - Structure to be registered for each parent device to
diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 66dda06ec42d1b..748ec0e0293aea 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -178,11 +178,11 @@ enum vfio_notify_type { /* events for VFIO_GROUP_NOTIFY */ #define VFIO_GROUP_NOTIFY_SET_KVM BIT(0)
-extern int vfio_register_notifier(struct device *dev, +extern int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *required_events, struct notifier_block *nb); -extern int vfio_unregister_notifier(struct device *dev, +extern int vfio_unregister_notifier(struct vfio_device *dev, enum vfio_notify_type type, struct notifier_block *nb);
On Mon, Apr 18, 2022 at 11:28:30AM -0400, Tony Krowiak wrote:
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index a4555014bd1e72..8a5c46aa2bef61 100644 +++ b/drivers/vfio/vfio.c @@ -2484,19 +2484,15 @@ static int vfio_unregister_group_notifier(struct vfio_group *group, return ret; } -int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *events, struct notifier_block *nb) {
- struct vfio_group *group;
- struct vfio_group *group = dev->group;
Is there a guarantee that dev != NULL? The original code below checks the value of dev, so why is that check eliminated here?
Yes, no kernel driver calls this with null dev. The original code should have been a WARN_ON as it is just protecting against a buggy driver. In this case if the driver is buggy we simply generate a backtrace through a null deref panic.
Jason
On 4/18/22 11:44 AM, Jason Gunthorpe wrote:
On Mon, Apr 18, 2022 at 11:28:30AM -0400, Tony Krowiak wrote:
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index a4555014bd1e72..8a5c46aa2bef61 100644 +++ b/drivers/vfio/vfio.c @@ -2484,19 +2484,15 @@ static int vfio_unregister_group_notifier(struct vfio_group *group, return ret; } -int vfio_register_notifier(struct device *dev, enum vfio_notify_type type, +int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, unsigned long *events, struct notifier_block *nb) {
- struct vfio_group *group;
- struct vfio_group *group = dev->group;
Is there a guarantee that dev != NULL? The original code below checks the value of dev, so why is that check eliminated here?
Yes, no kernel driver calls this with null dev. The original code should have been a WARN_ON as it is just protecting against a buggy driver. In this case if the driver is buggy we simply generate a backtrace through a null deref panic.
Jason
Regarding the vfio_ap parts: Reviewed-by: Tony Krowiak akrowiak@linux.ibm.com
On 4/12/22 11:53, Jason Gunthorpe wrote:
All callers have a struct vfio_device trivially available, pass it in directly and avoid calling the expensive vfio_group_get_from_dev().
To support the unconverted kvmgt mdev driver add mdev_legacy_get_vfio_device() which will return the vfio_device pointer vfio_mdev.c puts in the drv_data.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com ... diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 6e08d04b605d6e..69768061cd7bd9 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1406,21 +1406,21 @@ static int vfio_ap_mdev_open_device(struct vfio_device *vdev) matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier; events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
&events, &matrix_mdev->group_notifier);
ret = vfio_register_notifier(vdev, VFIO_GROUP_NOTIFY, &events,
&matrix_mdev->group_notifier);
if (ret) return ret;
matrix_mdev->iommu_notifier.notifier_call = vfio_ap_mdev_iommu_notifier; events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
- ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
&events, &matrix_mdev->iommu_notifier);
ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY, &events,
&matrix_mdev->iommu_notifier);
if (ret) goto out_unregister_group; return 0;
out_unregister_group:
- vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); return ret; }
@@ -1430,9 +1430,9 @@ static void vfio_ap_mdev_close_device(struct vfio_device *vdev) struct ap_matrix_mdev *matrix_mdev = container_of(vdev, struct ap_matrix_mdev, vdev);
- vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &matrix_mdev->iommu_notifier);
- vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY,
- vfio_unregister_notifier(vdev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); vfio_ap_mdev_unset_kvm(matrix_mdev); }
looks good for vfio_ap. Reviewed-by: Jason J. Herne jjherne@linux.ibm.com
The next patch wants the vfio_device instead. There is no reason to store a pointer here since we can container_of back to the vfio_device.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/s390/cio/vfio_ccw_cp.c | 44 +++++++++++++++++++-------------- drivers/s390/cio/vfio_ccw_cp.h | 4 +-- drivers/s390/cio/vfio_ccw_fsm.c | 3 +-- 3 files changed, 28 insertions(+), 23 deletions(-)
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c index 8d1b2771c1aa02..af5048a1ba8894 100644 --- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -16,6 +16,7 @@ #include <asm/idals.h>
#include "vfio_ccw_cp.h" +#include "vfio_ccw_private.h"
struct pfn_array { /* Starting guest physical I/O address. */ @@ -98,17 +99,17 @@ static int pfn_array_alloc(struct pfn_array *pa, u64 iova, unsigned int len) * If the pin request partially succeeds, or fails completely, * all pages are left unpinned and a negative error value is returned. */ -static int pfn_array_pin(struct pfn_array *pa, struct device *mdev) +static int pfn_array_pin(struct pfn_array *pa, struct vfio_device *vdev) { int ret = 0;
- ret = vfio_pin_pages(mdev, pa->pa_iova_pfn, pa->pa_nr, + ret = vfio_pin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr, IOMMU_READ | IOMMU_WRITE, pa->pa_pfn);
if (ret < 0) { goto err_out; } else if (ret > 0 && ret != pa->pa_nr) { - vfio_unpin_pages(mdev, pa->pa_iova_pfn, ret); + vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, ret); ret = -EINVAL; goto err_out; } @@ -122,11 +123,11 @@ static int pfn_array_pin(struct pfn_array *pa, struct device *mdev) }
/* Unpin the pages before releasing the memory. */ -static void pfn_array_unpin_free(struct pfn_array *pa, struct device *mdev) +static void pfn_array_unpin_free(struct pfn_array *pa, struct vfio_device *vdev) { /* Only unpin if any pages were pinned to begin with */ if (pa->pa_nr) - vfio_unpin_pages(mdev, pa->pa_iova_pfn, pa->pa_nr); + vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr); pa->pa_nr = 0; kfree(pa->pa_iova_pfn); } @@ -190,7 +191,7 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len) * Within the domain (@mdev), copy @n bytes from a guest physical * address (@iova) to a host physical address (@to). */ -static long copy_from_iova(struct device *mdev, +static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova, unsigned long n) { @@ -203,9 +204,9 @@ static long copy_from_iova(struct device *mdev, if (ret < 0) return ret;
- ret = pfn_array_pin(&pa, mdev); + ret = pfn_array_pin(&pa, vdev); if (ret < 0) { - pfn_array_unpin_free(&pa, mdev); + pfn_array_unpin_free(&pa, vdev); return ret; }
@@ -226,7 +227,7 @@ static long copy_from_iova(struct device *mdev, break; }
- pfn_array_unpin_free(&pa, mdev); + pfn_array_unpin_free(&pa, vdev);
return l; } @@ -423,11 +424,13 @@ static int ccwchain_loop_tic(struct ccwchain *chain,
static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp) { + struct vfio_device *vdev = + &container_of(cp, struct vfio_ccw_private, cp)->vdev; struct ccwchain *chain; int len, ret;
/* Copy 2K (the most we support today) of possible CCWs */ - len = copy_from_iova(cp->mdev, cp->guest_cp, cda, + len = copy_from_iova(vdev, cp->guest_cp, cda, CCWCHAIN_LEN_MAX * sizeof(struct ccw1)); if (len) return len; @@ -508,6 +511,8 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, int idx, struct channel_program *cp) { + struct vfio_device *vdev = + &container_of(cp, struct vfio_ccw_private, cp)->vdev; struct ccw1 *ccw; struct pfn_array *pa; u64 iova; @@ -526,7 +531,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, if (ccw_is_idal(ccw)) { /* Read first IDAW to see if it's 4K-aligned or not. */ /* All subsequent IDAws will be 4K-aligned. */ - ret = copy_from_iova(cp->mdev, &iova, ccw->cda, sizeof(iova)); + ret = copy_from_iova(vdev, &iova, ccw->cda, sizeof(iova)); if (ret) return ret; } else { @@ -555,7 +560,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
if (ccw_is_idal(ccw)) { /* Copy guest IDAL into host IDAL */ - ret = copy_from_iova(cp->mdev, idaws, ccw->cda, idal_len); + ret = copy_from_iova(vdev, idaws, ccw->cda, idal_len); if (ret) goto out_unpin;
@@ -574,7 +579,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, }
if (ccw_does_data_transfer(ccw)) { - ret = pfn_array_pin(pa, cp->mdev); + ret = pfn_array_pin(pa, vdev); if (ret < 0) goto out_unpin; } else { @@ -590,7 +595,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, return 0;
out_unpin: - pfn_array_unpin_free(pa, cp->mdev); + pfn_array_unpin_free(pa, vdev); out_free_idaws: kfree(idaws); out_init: @@ -632,8 +637,10 @@ static int ccwchain_fetch_one(struct ccwchain *chain, * Returns: * %0 on success and a negative error value on failure. */ -int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) +int cp_init(struct channel_program *cp, union orb *orb) { + struct vfio_device *vdev = + &container_of(cp, struct vfio_ccw_private, cp)->vdev; /* custom ratelimit used to avoid flood during guest IPL */ static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 1); int ret; @@ -650,11 +657,10 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) * the problem if something does break. */ if (!orb->cmd.pfch && __ratelimit(&ratelimit_state)) - dev_warn(mdev, "Prefetching channel program even though prefetch not specified in ORB"); + dev_warn(vdev->dev, "Prefetching channel program even though prefetch not specified in ORB");
INIT_LIST_HEAD(&cp->ccwchain_list); memcpy(&cp->orb, orb, sizeof(*orb)); - cp->mdev = mdev;
/* Build a ccwchain for the first CCW segment */ ret = ccwchain_handle_ccw(orb->cmd.cpa, cp); @@ -682,6 +688,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) */ void cp_free(struct channel_program *cp) { + struct vfio_device *vdev = + &container_of(cp, struct vfio_ccw_private, cp)->vdev; struct ccwchain *chain, *temp; int i;
@@ -691,7 +699,7 @@ void cp_free(struct channel_program *cp) cp->initialized = false; list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) { for (i = 0; i < chain->ch_len; i++) { - pfn_array_unpin_free(chain->ch_pa + i, cp->mdev); + pfn_array_unpin_free(chain->ch_pa + i, vdev); ccwchain_cda_free(chain, i); } ccwchain_free(chain); diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h index ba31240ce96594..e4c436199b4cda 100644 --- a/drivers/s390/cio/vfio_ccw_cp.h +++ b/drivers/s390/cio/vfio_ccw_cp.h @@ -37,13 +37,11 @@ struct channel_program { struct list_head ccwchain_list; union orb orb; - struct device *mdev; bool initialized; struct ccw1 *guest_cp; };
-extern int cp_init(struct channel_program *cp, struct device *mdev, - union orb *orb); +extern int cp_init(struct channel_program *cp, union orb *orb); extern void cp_free(struct channel_program *cp); extern int cp_prefetch(struct channel_program *cp); extern union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm); diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c index e435a9cd92dacf..8483a266051c21 100644 --- a/drivers/s390/cio/vfio_ccw_fsm.c +++ b/drivers/s390/cio/vfio_ccw_fsm.c @@ -262,8 +262,7 @@ static void fsm_io_request(struct vfio_ccw_private *private, errstr = "transport mode"; goto err_out; } - io_region->ret_code = cp_init(&private->cp, mdev_dev(mdev), - orb); + io_region->ret_code = cp_init(&private->cp, orb); if (io_region->ret_code) { VFIO_CCW_MSG_EVENT(2, "%pUl (%x.%x.%04x): cp_init=%d\n",
On Tue, 2022-04-12 at 12:53 -0300, Jason Gunthorpe wrote:
The next patch wants the vfio_device instead. There is no reason to store a pointer here since we can container_of back to the vfio_device.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/s390/cio/vfio_ccw_cp.c | 44 +++++++++++++++++++----------
drivers/s390/cio/vfio_ccw_cp.h | 4 +-- drivers/s390/cio/vfio_ccw_fsm.c | 3 +-- 3 files changed, 28 insertions(+), 23 deletions(-)
There's opportunity for simplification here, but I'll handle that when I get to some other work in this space. For this series, this is fine.
Reviewed-by: Eric Farman farman@linux.ibm.com
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c index 8d1b2771c1aa02..af5048a1ba8894 100644 --- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -16,6 +16,7 @@ #include <asm/idals.h>
#include "vfio_ccw_cp.h" +#include "vfio_ccw_private.h"
struct pfn_array { /* Starting guest physical I/O address. */ @@ -98,17 +99,17 @@ static int pfn_array_alloc(struct pfn_array *pa, u64 iova, unsigned int len)
- If the pin request partially succeeds, or fails completely,
- all pages are left unpinned and a negative error value is
returned. */ -static int pfn_array_pin(struct pfn_array *pa, struct device *mdev) +static int pfn_array_pin(struct pfn_array *pa, struct vfio_device *vdev) { int ret = 0;
- ret = vfio_pin_pages(mdev, pa->pa_iova_pfn, pa->pa_nr,
ret = vfio_pin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr, IOMMU_READ | IOMMU_WRITE, pa->pa_pfn);
if (ret < 0) { goto err_out; } else if (ret > 0 && ret != pa->pa_nr) {
vfio_unpin_pages(mdev, pa->pa_iova_pfn, ret);
ret = -EINVAL; goto err_out; }vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, ret);
@@ -122,11 +123,11 @@ static int pfn_array_pin(struct pfn_array *pa, struct device *mdev) }
/* Unpin the pages before releasing the memory. */ -static void pfn_array_unpin_free(struct pfn_array *pa, struct device *mdev) +static void pfn_array_unpin_free(struct pfn_array *pa, struct vfio_device *vdev) { /* Only unpin if any pages were pinned to begin with */ if (pa->pa_nr)
vfio_unpin_pages(mdev, pa->pa_iova_pfn, pa->pa_nr);
vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, pa-
pa_nr);
pa->pa_nr = 0; kfree(pa->pa_iova_pfn); } @@ -190,7 +191,7 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
- Within the domain (@mdev), copy @n bytes from a guest physical
- address (@iova) to a host physical address (@to).
*/ -static long copy_from_iova(struct device *mdev, +static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova, unsigned long n) { @@ -203,9 +204,9 @@ static long copy_from_iova(struct device *mdev, if (ret < 0) return ret;
- ret = pfn_array_pin(&pa, mdev);
- ret = pfn_array_pin(&pa, vdev); if (ret < 0) {
pfn_array_unpin_free(&pa, mdev);
return ret; }pfn_array_unpin_free(&pa, vdev);
@@ -226,7 +227,7 @@ static long copy_from_iova(struct device *mdev, break; }
- pfn_array_unpin_free(&pa, mdev);
pfn_array_unpin_free(&pa, vdev);
return l;
} @@ -423,11 +424,13 @@ static int ccwchain_loop_tic(struct ccwchain *chain,
static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp) {
struct vfio_device *vdev =
&container_of(cp, struct vfio_ccw_private, cp)->vdev;
struct ccwchain *chain; int len, ret;
/* Copy 2K (the most we support today) of possible CCWs */
- len = copy_from_iova(cp->mdev, cp->guest_cp, cda,
- len = copy_from_iova(vdev, cp->guest_cp, cda, CCWCHAIN_LEN_MAX * sizeof(struct ccw1)); if (len) return len;
@@ -508,6 +511,8 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, int idx, struct channel_program *cp) {
- struct vfio_device *vdev =
struct ccw1 *ccw; struct pfn_array *pa; u64 iova;&container_of(cp, struct vfio_ccw_private, cp)->vdev;
@@ -526,7 +531,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, if (ccw_is_idal(ccw)) { /* Read first IDAW to see if it's 4K-aligned or not. */ /* All subsequent IDAws will be 4K-aligned. */
ret = copy_from_iova(cp->mdev, &iova, ccw->cda,
sizeof(iova));
ret = copy_from_iova(vdev, &iova, ccw->cda,
sizeof(iova)); if (ret) return ret; } else { @@ -555,7 +560,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
if (ccw_is_idal(ccw)) { /* Copy guest IDAL into host IDAL */
ret = copy_from_iova(cp->mdev, idaws, ccw->cda,
idal_len);
if (ret) goto out_unpin;ret = copy_from_iova(vdev, idaws, ccw->cda, idal_len);
@@ -574,7 +579,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, }
if (ccw_does_data_transfer(ccw)) {
ret = pfn_array_pin(pa, cp->mdev);
if (ret < 0) goto out_unpin; } else {ret = pfn_array_pin(pa, vdev);
@@ -590,7 +595,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain, return 0;
out_unpin:
- pfn_array_unpin_free(pa, cp->mdev);
- pfn_array_unpin_free(pa, vdev);
out_free_idaws: kfree(idaws); out_init: @@ -632,8 +637,10 @@ static int ccwchain_fetch_one(struct ccwchain *chain,
- Returns:
- %0 on success and a negative error value on failure.
*/ -int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) +int cp_init(struct channel_program *cp, union orb *orb) {
- struct vfio_device *vdev =
/* custom ratelimit used to avoid flood during guest IPL */ static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 1); int ret;&container_of(cp, struct vfio_ccw_private, cp)->vdev;
@@ -650,11 +657,10 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) * the problem if something does break. */ if (!orb->cmd.pfch && __ratelimit(&ratelimit_state))
dev_warn(mdev, "Prefetching channel program even though
prefetch not specified in ORB");
dev_warn(vdev->dev, "Prefetching channel program even
though prefetch not specified in ORB");
INIT_LIST_HEAD(&cp->ccwchain_list); memcpy(&cp->orb, orb, sizeof(*orb));
cp->mdev = mdev;
/* Build a ccwchain for the first CCW segment */ ret = ccwchain_handle_ccw(orb->cmd.cpa, cp);
@@ -682,6 +688,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) */ void cp_free(struct channel_program *cp) {
- struct vfio_device *vdev =
struct ccwchain *chain, *temp; int i;&container_of(cp, struct vfio_ccw_private, cp)->vdev;
@@ -691,7 +699,7 @@ void cp_free(struct channel_program *cp) cp->initialized = false; list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) { for (i = 0; i < chain->ch_len; i++) {
pfn_array_unpin_free(chain->ch_pa + i, cp-
mdev);
} ccwchain_free(chain);pfn_array_unpin_free(chain->ch_pa + i, vdev); ccwchain_cda_free(chain, i);
diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h index ba31240ce96594..e4c436199b4cda 100644 --- a/drivers/s390/cio/vfio_ccw_cp.h +++ b/drivers/s390/cio/vfio_ccw_cp.h @@ -37,13 +37,11 @@ struct channel_program { struct list_head ccwchain_list; union orb orb;
- struct device *mdev; bool initialized; struct ccw1 *guest_cp;
};
-extern int cp_init(struct channel_program *cp, struct device *mdev,
union orb *orb);
+extern int cp_init(struct channel_program *cp, union orb *orb); extern void cp_free(struct channel_program *cp); extern int cp_prefetch(struct channel_program *cp); extern union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm); diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c index e435a9cd92dacf..8483a266051c21 100644 --- a/drivers/s390/cio/vfio_ccw_fsm.c +++ b/drivers/s390/cio/vfio_ccw_fsm.c @@ -262,8 +262,7 @@ static void fsm_io_request(struct vfio_ccw_private *private, errstr = "transport mode"; goto err_out; }
io_region->ret_code = cp_init(&private->cp,
mdev_dev(mdev),
orb);
if (io_region->ret_code) { VFIO_CCW_MSG_EVENT(2, "%pUl (%x.%x.%04x):io_region->ret_code = cp_init(&private->cp, orb);
cp_init=%d\n",
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- .../driver-api/vfio-mediated-device.rst | 4 +- drivers/s390/cio/vfio_ccw_cp.c | 6 +-- drivers/s390/crypto/vfio_ap_ops.c | 8 ++-- drivers/vfio/vfio.c | 40 ++++++------------- include/linux/vfio.h | 4 +- 5 files changed, 24 insertions(+), 38 deletions(-)
diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index 9f26079cacae35..6aeca741dc9be1 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -279,10 +279,10 @@ Translation APIs for Mediated Devices The following APIs are provided for translating user pfn to host pfn in a VFIO driver::
- extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, + extern int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn);
- extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, + extern int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage);
These functions call back into the back-end IOMMU module by using the pin_pages diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c index af5048a1ba8894..e362cb962a7234 100644 --- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -103,13 +103,13 @@ static int pfn_array_pin(struct pfn_array *pa, struct vfio_device *vdev) { int ret = 0;
- ret = vfio_pin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr, + ret = vfio_pin_pages(vdev, pa->pa_iova_pfn, pa->pa_nr, IOMMU_READ | IOMMU_WRITE, pa->pa_pfn);
if (ret < 0) { goto err_out; } else if (ret > 0 && ret != pa->pa_nr) { - vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, ret); + vfio_unpin_pages(vdev, pa->pa_iova_pfn, ret); ret = -EINVAL; goto err_out; } @@ -127,7 +127,7 @@ static void pfn_array_unpin_free(struct pfn_array *pa, struct vfio_device *vdev) { /* Only unpin if any pages were pinned to begin with */ if (pa->pa_nr) - vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr); + vfio_unpin_pages(vdev, pa->pa_iova_pfn, pa->pa_nr); pa->pa_nr = 0; kfree(pa->pa_iova_pfn); } diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 69768061cd7bd9..a10b3369d76c41 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -124,7 +124,7 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q) q->saved_isc = VFIO_AP_ISC_INVALID; } if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) { - vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), + vfio_unpin_pages(&q->matrix_mdev->vdev, &q->saved_pfn, 1); q->saved_pfn = 0; } @@ -258,7 +258,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, return status; }
- ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1, + ret = vfio_pin_pages(&q->matrix_mdev->vdev, &g_pfn, 1, IOMMU_READ | IOMMU_WRITE, &h_pfn); switch (ret) { case 1: @@ -301,7 +301,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, break; case AP_RESPONSE_OTHERWISE_CHANGED: /* We could not modify IRQ setings: clear new configuration */ - vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1); + vfio_unpin_pages(&q->matrix_mdev->vdev, &g_pfn, 1); kvm_s390_gisc_unregister(kvm, isc); break; default: @@ -1250,7 +1250,7 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, struct vfio_iommu_type1_dma_unmap *unmap = data; unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
- vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn, 1); + vfio_unpin_pages(&matrix_mdev->vdev, &g_pfn, 1); return NOTIFY_OK; }
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 8a5c46aa2bef61..24b92a45cfc8f1 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2142,32 +2142,26 @@ EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare); * @phys_pfn[out]: array of host PFNs * Return error or number of pages pinned. */ -int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage, +int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn) { struct vfio_container *container; - struct vfio_group *group; + struct vfio_group *group = vdev->group; struct vfio_iommu_driver *driver; int ret;
- if (!dev || !user_pfn || !phys_pfn || !npage) + if (!user_pfn || !phys_pfn || !npage) return -EINVAL;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) return -E2BIG;
- group = vfio_group_get_from_dev(dev); - if (!group) - return -ENODEV; - - if (group->dev_counter > 1) { - ret = -EINVAL; - goto err_pin_pages; - } + if (group->dev_counter > 1) + return -EINVAL;
ret = vfio_group_add_container_user(group); if (ret) - goto err_pin_pages; + return ret;
container = group->container; driver = container->iommu_driver; @@ -2180,8 +2174,6 @@ int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage,
vfio_group_try_dissolve_container(group);
-err_pin_pages: - vfio_group_put(group); return ret; } EXPORT_SYMBOL(vfio_pin_pages); @@ -2195,28 +2187,24 @@ EXPORT_SYMBOL(vfio_pin_pages); * be greater than VFIO_PIN_PAGES_MAX_ENTRIES. * Return error or number of pages unpinned. */ -int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, int npage) +int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, + int npage) { struct vfio_container *container; - struct vfio_group *group; struct vfio_iommu_driver *driver; int ret;
- if (!dev || !user_pfn || !npage) + if (!user_pfn || !npage) return -EINVAL;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) return -E2BIG;
- group = vfio_group_get_from_dev(dev); - if (!group) - return -ENODEV; - - ret = vfio_group_add_container_user(group); + ret = vfio_group_add_container_user(vdev->group); if (ret) - goto err_unpin_pages; + return ret;
- container = group->container; + container = vdev->group->container; driver = container->iommu_driver; if (likely(driver && driver->ops->unpin_pages)) ret = driver->ops->unpin_pages(container->iommu_data, user_pfn, @@ -2224,10 +2212,8 @@ int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, int npage) else ret = -ENOTTY;
- vfio_group_try_dissolve_container(group); + vfio_group_try_dissolve_container(vdev->group);
-err_unpin_pages: - vfio_group_put(group); return ret; } EXPORT_SYMBOL(vfio_unpin_pages); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 748ec0e0293aea..8f2a09801a660b 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -150,9 +150,9 @@ extern long vfio_external_check_extension(struct vfio_group *group,
#define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned long))
-extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, +extern int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn); -extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, +extern int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage);
extern int vfio_group_pin_pages(struct vfio_group *group,
On Tue, 2022-04-12 at 12:53 -0300, Jason Gunthorpe wrote:
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
.../driver-api/vfio-mediated-device.rst | 4 +- drivers/s390/cio/vfio_ccw_cp.c | 6 +-- drivers/s390/crypto/vfio_ap_ops.c | 8 ++-- drivers/vfio/vfio.c | 40 ++++++----------- -- include/linux/vfio.h | 4 +- 5 files changed, 24 insertions(+), 38 deletions(-)
For the -ccw bits:
Acked-by: Eric Farman farman@linux.ibm.com
diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index 9f26079cacae35..6aeca741dc9be1 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -279,10 +279,10 @@ Translation APIs for Mediated Devices The following APIs are provided for translating user pfn to host pfn in a VFIO driver::
- extern int vfio_pin_pages(struct device *dev, unsigned long
*user_pfn,
- extern int vfio_pin_pages(struct vfio_device *vdev, unsigned
long *user_pfn, int npage, int prot, unsigned long *phys_pfn);
- extern int vfio_unpin_pages(struct device *dev, unsigned long
*user_pfn,
- extern int vfio_unpin_pages(struct vfio_device *vdev, unsigned
long *user_pfn, int npage);
These functions call back into the back-end IOMMU module by using the pin_pages diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c index af5048a1ba8894..e362cb962a7234 100644 --- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -103,13 +103,13 @@ static int pfn_array_pin(struct pfn_array *pa, struct vfio_device *vdev) { int ret = 0;
- ret = vfio_pin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr,
ret = vfio_pin_pages(vdev, pa->pa_iova_pfn, pa->pa_nr, IOMMU_READ | IOMMU_WRITE, pa->pa_pfn);
if (ret < 0) { goto err_out; } else if (ret > 0 && ret != pa->pa_nr) {
vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, ret);
ret = -EINVAL; goto err_out; }vfio_unpin_pages(vdev, pa->pa_iova_pfn, ret);
@@ -127,7 +127,7 @@ static void pfn_array_unpin_free(struct pfn_array *pa, struct vfio_device *vdev) { /* Only unpin if any pages were pinned to begin with */ if (pa->pa_nr)
vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, pa-
pa_nr);
pa->pa_nr = 0; kfree(pa->pa_iova_pfn);vfio_unpin_pages(vdev, pa->pa_iova_pfn, pa->pa_nr);
} diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 69768061cd7bd9..a10b3369d76c41 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -124,7 +124,7 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q) q->saved_isc = VFIO_AP_ISC_INVALID; } if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) {
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
q->saved_pfn = 0; }vfio_unpin_pages(&q->matrix_mdev->vdev, &q->saved_pfn, 1);
@@ -258,7 +258,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, return status; }
- ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
- ret = vfio_pin_pages(&q->matrix_mdev->vdev, &g_pfn, 1, IOMMU_READ | IOMMU_WRITE, &h_pfn); switch (ret) { case 1:
@@ -301,7 +301,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, break; case AP_RESPONSE_OTHERWISE_CHANGED: /* We could not modify IRQ setings: clear new configuration */
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
&g_pfn, 1);
kvm_s390_gisc_unregister(kvm, isc); break; default:vfio_unpin_pages(&q->matrix_mdev->vdev, &g_pfn, 1);
@@ -1250,7 +1250,7 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, struct vfio_iommu_type1_dma_unmap *unmap = data; unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn,
1);
return NOTIFY_OK; }vfio_unpin_pages(&matrix_mdev->vdev, &g_pfn, 1);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 8a5c46aa2bef61..24b92a45cfc8f1 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2142,32 +2142,26 @@ EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
- @phys_pfn[out]: array of host PFNs
- Return error or number of pages pinned.
*/ -int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage, +int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn) { struct vfio_container *container;
- struct vfio_group *group;
- struct vfio_group *group = vdev->group; struct vfio_iommu_driver *driver; int ret;
- if (!dev || !user_pfn || !phys_pfn || !npage)
if (!user_pfn || !phys_pfn || !npage) return -EINVAL;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) return -E2BIG;
- group = vfio_group_get_from_dev(dev);
- if (!group)
return -ENODEV;
- if (group->dev_counter > 1) {
ret = -EINVAL;
goto err_pin_pages;
- }
if (group->dev_counter > 1)
return -EINVAL;
ret = vfio_group_add_container_user(group); if (ret)
goto err_pin_pages;
return ret;
container = group->container; driver = container->iommu_driver;
@@ -2180,8 +2174,6 @@ int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage,
vfio_group_try_dissolve_container(group);
-err_pin_pages:
- vfio_group_put(group); return ret;
} EXPORT_SYMBOL(vfio_pin_pages); @@ -2195,28 +2187,24 @@ EXPORT_SYMBOL(vfio_pin_pages);
be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
- Return error or number of pages unpinned.
*/ -int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, int npage) +int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn,
int npage)
{ struct vfio_container *container;
struct vfio_group *group; struct vfio_iommu_driver *driver; int ret;
if (!dev || !user_pfn || !npage)
if (!user_pfn || !npage) return -EINVAL;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) return -E2BIG;
- group = vfio_group_get_from_dev(dev);
- if (!group)
return -ENODEV;
- ret = vfio_group_add_container_user(group);
- ret = vfio_group_add_container_user(vdev->group); if (ret)
goto err_unpin_pages;
return ret;
- container = group->container;
- container = vdev->group->container; driver = container->iommu_driver; if (likely(driver && driver->ops->unpin_pages)) ret = driver->ops->unpin_pages(container->iommu_data,
user_pfn, @@ -2224,10 +2212,8 @@ int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, int npage) else ret = -ENOTTY;
- vfio_group_try_dissolve_container(group);
- vfio_group_try_dissolve_container(vdev->group);
-err_unpin_pages:
- vfio_group_put(group); return ret;
} EXPORT_SYMBOL(vfio_unpin_pages); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 748ec0e0293aea..8f2a09801a660b 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -150,9 +150,9 @@ extern long vfio_external_check_extension(struct vfio_group *group,
#define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned long))
-extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, +extern int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn); -extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, +extern int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage);
extern int vfio_group_pin_pages(struct vfio_group *group,
On 4/12/22 11:53, Jason Gunthorpe wrote:
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model. ... diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 69768061cd7bd9..a10b3369d76c41 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -124,7 +124,7 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q) q->saved_isc = VFIO_AP_ISC_INVALID; } if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) {
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
vfio_unpin_pages(&q->matrix_mdev->vdev, &q->saved_pfn, 1);
Could be contracted to a single line. If you feel like it :)
q->saved_pfn = 0;
} @@ -258,7 +258,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, return status; }
- ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
- ret = vfio_pin_pages(&q->matrix_mdev->vdev, &g_pfn, 1, IOMMU_READ | IOMMU_WRITE, &h_pfn); switch (ret) { case 1:
@@ -301,7 +301,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, break; case AP_RESPONSE_OTHERWISE_CHANGED: /* We could not modify IRQ setings: clear new configuration */
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
kvm_s390_gisc_unregister(kvm, isc); break; default:vfio_unpin_pages(&q->matrix_mdev->vdev, &g_pfn, 1);
@@ -1250,7 +1250,7 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, struct vfio_iommu_type1_dma_unmap *unmap = data; unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn, 1);
return NOTIFY_OK; }vfio_unpin_pages(&matrix_mdev->vdev, &g_pfn, 1);
Looks good to me. Reviewed-by: Jason J. Herne jjherne@linux.ibm.com
On Mon, Apr 18, 2022 at 11:25:15AM -0400, Jason J. Herne wrote:
On 4/12/22 11:53, Jason Gunthorpe wrote:
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model. ... diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 69768061cd7bd9..a10b3369d76c41 100644 +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -124,7 +124,7 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q) q->saved_isc = VFIO_AP_ISC_INVALID; } if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) {
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
vfio_unpin_pages(&q->matrix_mdev->vdev, &q->saved_pfn, 1);
Could be contracted to a single line. If you feel like it :)
Done, thanks
Jason
On 4/12/22 11:53 AM, Jason Gunthorpe wrote:
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
.../driver-api/vfio-mediated-device.rst | 4 +- drivers/s390/cio/vfio_ccw_cp.c | 6 +-- drivers/s390/crypto/vfio_ap_ops.c | 8 ++-- drivers/vfio/vfio.c | 40 ++++++------------- include/linux/vfio.h | 4 +- 5 files changed, 24 insertions(+), 38 deletions(-)
diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index 9f26079cacae35..6aeca741dc9be1 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -279,10 +279,10 @@ Translation APIs for Mediated Devices The following APIs are provided for translating user pfn to host pfn in a VFIO driver::
- extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
- extern int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn);
- extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
extern int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage);
These functions call back into the back-end IOMMU module by using the pin_pages
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 69768061cd7bd9..a10b3369d76c41 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -124,7 +124,7 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q) q->saved_isc = VFIO_AP_ISC_INVALID; } if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) {
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
q->saved_pfn = 0; }vfio_unpin_pages(&q->matrix_mdev->vdev, &q->saved_pfn, 1);
@@ -258,7 +258,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, return status; }
- ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
- ret = vfio_pin_pages(&q->matrix_mdev->vdev, &g_pfn, 1, IOMMU_READ | IOMMU_WRITE, &h_pfn); switch (ret) { case 1:
@@ -301,7 +301,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, break; case AP_RESPONSE_OTHERWISE_CHANGED: /* We could not modify IRQ setings: clear new configuration */
vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
kvm_s390_gisc_unregister(kvm, isc); break; default:vfio_unpin_pages(&q->matrix_mdev->vdev, &g_pfn, 1);
@@ -1250,7 +1250,7 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, struct vfio_iommu_type1_dma_unmap *unmap = data; unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn, 1);
return NOTIFY_OK; }vfio_unpin_pages(&matrix_mdev->vdev, &g_pfn, 1);
The vfio_ap snippet: Reviewed-by: Tony Krowiak akrowiak@linux.ibm.com
Use the existing vfio_device versions of vfio_(un)pin_pages(). There is no reason to use a group interface here, kvmgt has easy access to a vfio_device.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/gpu/drm/i915/gvt/kvmgt.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index bb59d21cf898ab..df7d87409e3a9c 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -268,6 +268,7 @@ static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, { struct drm_i915_private *i915 = vgpu->gvt->gt->i915; struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); + struct vfio_device *vfio_dev = mdev_legacy_get_vfio_device(vdev->mdev); int total_pages; int npage; int ret; @@ -277,7 +278,7 @@ static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, for (npage = 0; npage < total_pages; npage++) { unsigned long cur_gfn = gfn + npage;
- ret = vfio_group_unpin_pages(vdev->vfio_group, &cur_gfn, 1); + ret = vfio_unpin_pages(vfio_dev, &cur_gfn, 1); drm_WARN_ON(&i915->drm, ret != 1); } } @@ -287,6 +288,7 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, unsigned long size, struct page **page) { struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); + struct vfio_device *vfio_dev = mdev_legacy_get_vfio_device(vdev->mdev); unsigned long base_pfn = 0; int total_pages; int npage; @@ -301,8 +303,8 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, unsigned long cur_gfn = gfn + npage; unsigned long pfn;
- ret = vfio_group_pin_pages(vdev->vfio_group, &cur_gfn, 1, - IOMMU_READ | IOMMU_WRITE, &pfn); + ret = vfio_pin_pages(vfio_dev, &cur_gfn, 1, + IOMMU_READ | IOMMU_WRITE, &pfn); if (ret != 1) { gvt_vgpu_err("vfio_pin_pages failed for gfn 0x%lx, ret %d\n", cur_gfn, ret);
Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/gpu/drm/i915/gvt/kvmgt.c | 5 +++-- drivers/vfio/vfio.c | 22 ++++++++++------------ include/linux/vfio.h | 2 +- 3 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index df7d87409e3a9c..3302d5d4d92146 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -2184,8 +2184,9 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned long gpa,
info = (struct kvmgt_guest_info *)handle;
- return vfio_dma_rw(kvmgt_vdev(info->vgpu)->vfio_group, - gpa, buf, len, write); + return vfio_dma_rw( + mdev_legacy_get_vfio_device(kvmgt_vdev(info->vgpu)->mdev), + gpa, buf, len, write); }
static int kvmgt_read_gpa(unsigned long handle, unsigned long gpa, diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 24b92a45cfc8f1..e6e102e017623b 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2323,32 +2323,28 @@ EXPORT_SYMBOL(vfio_group_unpin_pages); * As the read/write of user space memory is conducted via the CPUs and is * not a real device DMA, it is not necessary to pin the user space memory. * - * The caller needs to call vfio_group_get_external_user() or - * vfio_group_get_external_user_from_dev() prior to calling this interface, - * so as to prevent the VFIO group from disposal in the middle of the call. - * But it can keep the reference to the VFIO group for several calls into - * this interface. - * After finishing using of the VFIO group, the caller needs to release the - * VFIO group by calling vfio_group_put_external_user(). - * - * @group [in] : VFIO group + * @vdev [in] : VFIO device * @user_iova [in] : base IOVA of a user space buffer * @data [in] : pointer to kernel buffer * @len [in] : kernel buffer length * @write : indicate read or write * Return error code on failure or 0 on success. */ -int vfio_dma_rw(struct vfio_group *group, dma_addr_t user_iova, +int vfio_dma_rw(struct vfio_device *vdev, dma_addr_t user_iova, void *data, size_t len, bool write) { struct vfio_container *container; struct vfio_iommu_driver *driver; int ret = 0;
- if (!group || !data || len <= 0) + if (!data || len <= 0) return -EINVAL;
- container = group->container; + ret = vfio_group_add_container_user(vdev->group); + if (ret) + return ret; + + container = vdev->group->container; driver = container->iommu_driver;
if (likely(driver && driver->ops->dma_rw)) @@ -2357,6 +2353,8 @@ int vfio_dma_rw(struct vfio_group *group, dma_addr_t user_iova, else ret = -ENOTTY;
+ vfio_group_try_dissolve_container(vdev->group); + return ret; } EXPORT_SYMBOL(vfio_dma_rw); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 8f2a09801a660b..91d46e532ca104 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -161,7 +161,7 @@ extern int vfio_group_pin_pages(struct vfio_group *group, extern int vfio_group_unpin_pages(struct vfio_group *group, unsigned long *user_iova_pfn, int npage);
-extern int vfio_dma_rw(struct vfio_group *group, dma_addr_t user_iova, +extern int vfio_dma_rw(struct vfio_device *vdev, dma_addr_t user_iova, void *data, size_t len, bool write);
extern struct iommu_domain *vfio_group_iommu_domain(struct vfio_group *group);
try_module_get() must be undone if kvmgt_guest_init() fails or we leak the module reference count on the failure path since the close_device op is never called in this case.
Fixes: 9bdb073464d6 ("drm/i915/gvt: Change KVMGT as self load module") Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/gpu/drm/i915/gvt/kvmgt.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 3302d5d4d92146..d7c22a2601f3ad 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -952,13 +952,16 @@ static int intel_vgpu_open_device(struct mdev_device *mdev)
ret = kvmgt_guest_init(mdev); if (ret) - goto undo_group; + goto undo_module_get;
intel_gvt_ops->vgpu_activate(vgpu);
atomic_set(&vdev->released, 0); return ret;
+undo_module_get: + module_put(THIS_MODULE); + undo_group: vfio_group_put_external_user(vdev->vfio_group); vdev->vfio_group = NULL;
Nothing references this struct member any more, delete it completely.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/gpu/drm/i915/gvt/kvmgt.c | 17 +---------------- 1 file changed, 1 insertion(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index d7c22a2601f3ad..b15dbe9ecd7e15 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -133,7 +133,6 @@ struct kvmgt_vdev { struct work_struct release_work; atomic_t released; struct vfio_device *vfio_device; - struct vfio_group *vfio_group; };
static inline struct kvmgt_vdev *kvmgt_vdev(struct intel_vgpu *vgpu) @@ -911,7 +910,6 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu); unsigned long events; int ret; - struct vfio_group *vfio_group;
vdev->iommu_notifier.notifier_call = intel_vgpu_iommu_notifier; vdev->group_notifier.notifier_call = intel_vgpu_group_notifier; @@ -934,20 +932,12 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) goto undo_iommu; }
- vfio_group = vfio_group_get_external_user_from_dev(mdev_dev(mdev)); - if (IS_ERR_OR_NULL(vfio_group)) { - ret = !vfio_group ? -EFAULT : PTR_ERR(vfio_group); - gvt_vgpu_err("vfio_group_get_external_user_from_dev failed\n"); - goto undo_register; - } - vdev->vfio_group = vfio_group; - /* Take a module reference as mdev core doesn't take * a reference for vendor driver. */ if (!try_module_get(THIS_MODULE)) { ret = -ENODEV; - goto undo_group; + goto undo_register; }
ret = kvmgt_guest_init(mdev); @@ -962,10 +952,6 @@ static int intel_vgpu_open_device(struct mdev_device *mdev) undo_module_get: module_put(THIS_MODULE);
-undo_group: - vfio_group_put_external_user(vdev->vfio_group); - vdev->vfio_group = NULL; - undo_register: vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &vdev->group_notifier); @@ -1023,7 +1009,6 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu) kvmgt_guest_exit(info);
intel_vgpu_release_msi_eventfd_ctx(vgpu); - vfio_group_put_external_user(vdev->vfio_group);
vdev->kvm = NULL; vgpu->handle = 0;
Now that callers have been updated to use the vfio_device APIs the driver facing group interface is no longer used, delete it:
- vfio_group_get_external_user_from_dev() - vfio_group_pin_pages() - vfio_group_unpin_pages() - vfio_group_iommu_domain()
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/vfio/vfio.c | 151 ------------------------------------------- include/linux/vfio.h | 11 ---- 2 files changed, 162 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index e6e102e017623b..3d75505bf3cc26 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -1947,44 +1947,6 @@ struct vfio_group *vfio_group_get_external_user(struct file *filep) } EXPORT_SYMBOL_GPL(vfio_group_get_external_user);
-/* - * External user API, exported by symbols to be linked dynamically. - * The external user passes in a device pointer - * to verify that: - * - A VFIO group is assiciated with the device; - * - IOMMU is set for the group. - * If both checks passed, vfio_group_get_external_user_from_dev() - * increments the container user counter to prevent the VFIO group - * from disposal before external user exits and returns the pointer - * to the VFIO group. - * - * When the external user finishes using the VFIO group, it calls - * vfio_group_put_external_user() to release the VFIO group and - * decrement the container user counter. - * - * @dev [in] : device - * Return error PTR or pointer to VFIO group. - */ - -struct vfio_group *vfio_group_get_external_user_from_dev(struct device *dev) -{ - struct vfio_group *group; - int ret; - - group = vfio_group_get_from_dev(dev); - if (!group) - return ERR_PTR(-ENODEV); - - ret = vfio_group_add_container_user(group); - if (ret) { - vfio_group_put(group); - return ERR_PTR(ret); - } - - return group; -} -EXPORT_SYMBOL_GPL(vfio_group_get_external_user_from_dev); - void vfio_group_put_external_user(struct vfio_group *group) { vfio_group_try_dissolve_container(group); @@ -2218,101 +2180,6 @@ int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, } EXPORT_SYMBOL(vfio_unpin_pages);
-/* - * Pin a set of guest IOVA PFNs and return their associated host PFNs for a - * VFIO group. - * - * The caller needs to call vfio_group_get_external_user() or - * vfio_group_get_external_user_from_dev() prior to calling this interface, - * so as to prevent the VFIO group from disposal in the middle of the call. - * But it can keep the reference to the VFIO group for several calls into - * this interface. - * After finishing using of the VFIO group, the caller needs to release the - * VFIO group by calling vfio_group_put_external_user(). - * - * @group [in] : VFIO group - * @user_iova_pfn [in] : array of user/guest IOVA PFNs to be pinned. - * @npage [in] : count of elements in user_iova_pfn array. - * This count should not be greater - * VFIO_PIN_PAGES_MAX_ENTRIES. - * @prot [in] : protection flags - * @phys_pfn [out] : array of host PFNs - * Return error or number of pages pinned. - */ -int vfio_group_pin_pages(struct vfio_group *group, - unsigned long *user_iova_pfn, int npage, - int prot, unsigned long *phys_pfn) -{ - struct vfio_container *container; - struct vfio_iommu_driver *driver; - int ret; - - if (!group || !user_iova_pfn || !phys_pfn || !npage) - return -EINVAL; - - if (group->dev_counter > 1) - return -EINVAL; - - if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) - return -E2BIG; - - container = group->container; - driver = container->iommu_driver; - if (likely(driver && driver->ops->pin_pages)) - ret = driver->ops->pin_pages(container->iommu_data, - group->iommu_group, user_iova_pfn, - npage, prot, phys_pfn); - else - ret = -ENOTTY; - - return ret; -} -EXPORT_SYMBOL(vfio_group_pin_pages); - -/* - * Unpin a set of guest IOVA PFNs for a VFIO group. - * - * The caller needs to call vfio_group_get_external_user() or - * vfio_group_get_external_user_from_dev() prior to calling this interface, - * so as to prevent the VFIO group from disposal in the middle of the call. - * But it can keep the reference to the VFIO group for several calls into - * this interface. - * After finishing using of the VFIO group, the caller needs to release the - * VFIO group by calling vfio_group_put_external_user(). - * - * @group [in] : vfio group - * @user_iova_pfn [in] : array of user/guest IOVA PFNs to be unpinned. - * @npage [in] : count of elements in user_iova_pfn array. - * This count should not be greater than - * VFIO_PIN_PAGES_MAX_ENTRIES. - * Return error or number of pages unpinned. - */ -int vfio_group_unpin_pages(struct vfio_group *group, - unsigned long *user_iova_pfn, int npage) -{ - struct vfio_container *container; - struct vfio_iommu_driver *driver; - int ret; - - if (!group || !user_iova_pfn || !npage) - return -EINVAL; - - if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) - return -E2BIG; - - container = group->container; - driver = container->iommu_driver; - if (likely(driver && driver->ops->unpin_pages)) - ret = driver->ops->unpin_pages(container->iommu_data, - user_iova_pfn, npage); - else - ret = -ENOTTY; - - return ret; -} -EXPORT_SYMBOL(vfio_group_unpin_pages); - - /* * This interface allows the CPUs to perform some sort of virtual DMA on * behalf of the device. @@ -2515,24 +2382,6 @@ int vfio_unregister_notifier(struct vfio_device *dev, } EXPORT_SYMBOL(vfio_unregister_notifier);
-struct iommu_domain *vfio_group_iommu_domain(struct vfio_group *group) -{ - struct vfio_container *container; - struct vfio_iommu_driver *driver; - - if (!group) - return ERR_PTR(-EINVAL); - - container = group->container; - driver = container->iommu_driver; - if (likely(driver && driver->ops->group_iommu_domain)) - return driver->ops->group_iommu_domain(container->iommu_data, - group->iommu_group); - - return ERR_PTR(-ENOTTY); -} -EXPORT_SYMBOL_GPL(vfio_group_iommu_domain); - /* * Module/class support */ diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 91d46e532ca104..9a9981c2622896 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -140,8 +140,6 @@ int vfio_mig_get_next_state(struct vfio_device *device, */ extern struct vfio_group *vfio_group_get_external_user(struct file *filep); extern void vfio_group_put_external_user(struct vfio_group *group); -extern struct vfio_group *vfio_group_get_external_user_from_dev(struct device - *dev); extern bool vfio_external_group_match_file(struct vfio_group *group, struct file *filep); extern int vfio_external_user_iommu_id(struct vfio_group *group); @@ -154,18 +152,9 @@ extern int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn); extern int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage); - -extern int vfio_group_pin_pages(struct vfio_group *group, - unsigned long *user_iova_pfn, int npage, - int prot, unsigned long *phys_pfn); -extern int vfio_group_unpin_pages(struct vfio_group *group, - unsigned long *user_iova_pfn, int npage); - extern int vfio_dma_rw(struct vfio_device *vdev, dma_addr_t user_iova, void *data, size_t len, bool write);
-extern struct iommu_domain *vfio_group_iommu_domain(struct vfio_group *group); - /* each type has independent events */ enum vfio_notify_type { VFIO_IOMMU_NOTIFY = 0,
When the open_device() op is called the container_users is incremented and held incremented until close_device(). Thus, so long as drivers call functions within their open_device()/close_device() region they do not need to worry about the container_users.
These functions can all only be called between open_device()/close_device():
vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw() vfio_register_notifier() vfio_unregister_notifier()
So eliminate the calls to vfio_group_add_container_user() and add a simple WARN_ON to detect mis-use by drivers.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/vfio/vfio.c | 67 +++++++++------------------------------------ 1 file changed, 13 insertions(+), 54 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 3d75505bf3cc26..ab0c3f5635905c 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2121,9 +2121,8 @@ int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, if (group->dev_counter > 1) return -EINVAL;
- ret = vfio_group_add_container_user(group); - if (ret) - return ret; + if (WARN_ON(!READ_ONCE(vdev->open_count))) + return -EINVAL;
container = group->container; driver = container->iommu_driver; @@ -2134,8 +2133,6 @@ int vfio_pin_pages(struct vfio_device *vdev, unsigned long *user_pfn, int npage, else ret = -ENOTTY;
- vfio_group_try_dissolve_container(group); - return ret; } EXPORT_SYMBOL(vfio_pin_pages); @@ -2162,9 +2159,8 @@ int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) return -E2BIG;
- ret = vfio_group_add_container_user(vdev->group); - if (ret) - return ret; + if (WARN_ON(!READ_ONCE(vdev->open_count))) + return -EINVAL;
container = vdev->group->container; driver = container->iommu_driver; @@ -2174,8 +2170,6 @@ int vfio_unpin_pages(struct vfio_device *vdev, unsigned long *user_pfn, else ret = -ENOTTY;
- vfio_group_try_dissolve_container(vdev->group); - return ret; } EXPORT_SYMBOL(vfio_unpin_pages); @@ -2207,9 +2201,8 @@ int vfio_dma_rw(struct vfio_device *vdev, dma_addr_t user_iova, if (!data || len <= 0) return -EINVAL;
- ret = vfio_group_add_container_user(vdev->group); - if (ret) - return ret; + if (WARN_ON(!READ_ONCE(vdev->open_count))) + return -EINVAL;
container = vdev->group->container; driver = container->iommu_driver; @@ -2219,9 +2212,6 @@ int vfio_dma_rw(struct vfio_device *vdev, dma_addr_t user_iova, user_iova, data, len, write); else ret = -ENOTTY; - - vfio_group_try_dissolve_container(vdev->group); - return ret; } EXPORT_SYMBOL(vfio_dma_rw); @@ -2234,10 +2224,6 @@ static int vfio_register_iommu_notifier(struct vfio_group *group, struct vfio_iommu_driver *driver; int ret;
- ret = vfio_group_add_container_user(group); - if (ret) - return -EINVAL; - container = group->container; driver = container->iommu_driver; if (likely(driver && driver->ops->register_notifier)) @@ -2245,9 +2231,6 @@ static int vfio_register_iommu_notifier(struct vfio_group *group, events, nb); else ret = -ENOTTY; - - vfio_group_try_dissolve_container(group); - return ret; }
@@ -2258,10 +2241,6 @@ static int vfio_unregister_iommu_notifier(struct vfio_group *group, struct vfio_iommu_driver *driver; int ret;
- ret = vfio_group_add_container_user(group); - if (ret) - return -EINVAL; - container = group->container; driver = container->iommu_driver; if (likely(driver && driver->ops->unregister_notifier)) @@ -2269,9 +2248,6 @@ static int vfio_unregister_iommu_notifier(struct vfio_group *group, nb); else ret = -ENOTTY; - - vfio_group_try_dissolve_container(group); - return ret; }
@@ -2300,10 +2276,6 @@ static int vfio_register_group_notifier(struct vfio_group *group, if (*events) return -EINVAL;
- ret = vfio_group_add_container_user(group); - if (ret) - return -EINVAL; - ret = blocking_notifier_chain_register(&group->notifier, nb);
/* @@ -2313,25 +2285,6 @@ static int vfio_register_group_notifier(struct vfio_group *group, if (!ret && set_kvm && group->kvm) blocking_notifier_call_chain(&group->notifier, VFIO_GROUP_NOTIFY_SET_KVM, group->kvm); - - vfio_group_try_dissolve_container(group); - - return ret; -} - -static int vfio_unregister_group_notifier(struct vfio_group *group, - struct notifier_block *nb) -{ - int ret; - - ret = vfio_group_add_container_user(group); - if (ret) - return -EINVAL; - - ret = blocking_notifier_chain_unregister(&group->notifier, nb); - - vfio_group_try_dissolve_container(group); - return ret; }
@@ -2344,6 +2297,9 @@ int vfio_register_notifier(struct vfio_device *dev, enum vfio_notify_type type, if (!nb || !events || (*events == 0)) return -EINVAL;
+ if (WARN_ON(!READ_ONCE(dev->open_count))) + return -EINVAL; + switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_register_iommu_notifier(group, events, nb); @@ -2368,12 +2324,15 @@ int vfio_unregister_notifier(struct vfio_device *dev, if (!nb) return -EINVAL;
+ if (WARN_ON(!READ_ONCE(dev->open_count))) + return -EINVAL; + switch (type) { case VFIO_IOMMU_NOTIFY: ret = vfio_unregister_iommu_notifier(group, nb); break; case VFIO_GROUP_NOTIFY: - ret = vfio_unregister_group_notifier(group, nb); + ret = blocking_notifier_chain_unregister(&group->notifier, nb); break; default: ret = -EINVAL;
On 4/12/22 11:53 AM, Jason Gunthorpe wrote:
When the open_device() op is called the container_users is incremented and held incremented until close_device(). Thus, so long as drivers call functions within their open_device()/close_device() region they do not need to worry about the container_users.
These functions can all only be called between open_device()/close_device():
vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw() vfio_register_notifier() vfio_unregister_notifier()
So eliminate the calls to vfio_group_add_container_user() and add a simple WARN_ON to detect mis-use by drivers.
vfio_device_fops_release decrements dev->open_count immediately before calling dev->ops->close_device, which means we could enter close_device with a dev_count of 0.
Maybe vfio_device_fops_release should handle the same way as vfio_group_get_device_fd?
if (device->open_count == 1 && device->ops->close_device) device->ops->close_device(device); device->open_count--;
On Thu, Apr 14, 2022 at 09:51:49AM -0400, Matthew Rosato wrote:
On 4/12/22 11:53 AM, Jason Gunthorpe wrote:
When the open_device() op is called the container_users is incremented and held incremented until close_device(). Thus, so long as drivers call functions within their open_device()/close_device() region they do not need to worry about the container_users.
These functions can all only be called between open_device()/close_device():
vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw() vfio_register_notifier() vfio_unregister_notifier()
So eliminate the calls to vfio_group_add_container_user() and add a simple WARN_ON to detect mis-use by drivers.
vfio_device_fops_release decrements dev->open_count immediately before calling dev->ops->close_device, which means we could enter close_device with a dev_count of 0.
Maybe vfio_device_fops_release should handle the same way as vfio_group_get_device_fd?
if (device->open_count == 1 && device->ops->close_device) device->ops->close_device(device); device->open_count--;
Yes, thanks alot! I have nothing to test these flows on...
It matches the ordering in the only other place to call close_device.
I folded this into the patch:
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 0f735f9f206002..29761f0cf0a227 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -1551,8 +1551,9 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
mutex_lock(&device->dev_set->lock); vfio_assert_device_open(device); - if (!--device->open_count && device->ops->close_device) + if (device->open_count == 1 && device->ops->close_device) device->ops->close_device(device); + device->open_count--; mutex_unlock(&device->dev_set->lock);
module_put(device->dev->driver->owner);
Jason
From: Jason Gunthorpe jgg@nvidia.com Sent: Thursday, April 14, 2022 10:22 PM
On Thu, Apr 14, 2022 at 09:51:49AM -0400, Matthew Rosato wrote:
On 4/12/22 11:53 AM, Jason Gunthorpe wrote:
When the open_device() op is called the container_users is incremented
and
held incremented until close_device(). Thus, so long as drivers call functions within their open_device()/close_device() region they do not need to worry about the container_users.
These functions can all only be called between open_device()/close_device():
vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw() vfio_register_notifier() vfio_unregister_notifier()
So eliminate the calls to vfio_group_add_container_user() and add a
simple
WARN_ON to detect mis-use by drivers.
vfio_device_fops_release decrements dev->open_count immediately
before
calling dev->ops->close_device, which means we could enter close_device
with
a dev_count of 0.
Maybe vfio_device_fops_release should handle the same way as vfio_group_get_device_fd?
if (device->open_count == 1 && device->ops->close_device) device->ops->close_device(device); device->open_count--;
Yes, thanks alot! I have nothing to test these flows on...
It matches the ordering in the only other place to call close_device.
I folded this into the patch:
While it's a welcomed fix is it actually related to this series? The point of this patch is that those functions are called when container_users is non-zero. This is true even without this fix given container_users is decremented after calling device->ops->close_device().
iiuc this might be better sent out as a separate fix out of this series? Or at least add a comment in the commit msg about taking chance to fix an unrelated issue to not cause confusion...
Thanks Kevin
On Fri, Apr 15, 2022 at 02:32:08AM +0000, Tian, Kevin wrote:
While it's a welcomed fix is it actually related to this series? The point of this patch is that those functions are called when container_users is non-zero. This is true even without this fix given container_users is decremented after calling device->ops->close_device().
It isn't, it is decremented before which causes it to be 0 when the assertions are called.
Jason
From: Jason Gunthorpe jgg@nvidia.com Sent: Friday, April 15, 2022 8:07 PM
On Fri, Apr 15, 2022 at 02:32:08AM +0000, Tian, Kevin wrote:
While it's a welcomed fix is it actually related to this series? The point of this patch is that those functions are called when container_users is non-zero. This is true even without this fix given container_users is decremented after calling device->ops->close_device().
It isn't, it is decremented before which causes it to be 0 when the assertions are called.
right, it's quite obvious when I read it the second time.
dri-devel@lists.freedesktop.org