On Fri, Jul 23, 2021 at 10:05:43AM +0200, Christoph Hellwig wrote:
On Tue, Jul 20, 2021 at 02:42:55PM -0300, Jason Gunthorpe wrote:
Keep track of all the vfio_devices that have been added to the device set and use this list in vfio_pci_try_bus_reset() instead of trying to work backwards from the pci_device.
The dev_set->lock directly prevents devices from joining/leaving the set, which further implies the pci_device cannot change drivers or that the vfio_device be freed, eliminating the need for get/put's.
Completeness of the device set can be directly measured by checking if every PCI device in the reset group is also in the device set - which proves that VFIO drivers are attached to everything.
This restructuring corrects a call to pci_dev_driver() without holding the device_lock() and removes a hard wiring to &vfio_pci_driver.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
I think the addition of the list to the dev_set should be a different patch. Or maybe even go into the one adding the dev_set concept.
OK
+static int vfio_pci_check_all_devices_bound(struct pci_dev *pdev, void *data) {
struct vfio_device_set *dev_set = data;
struct vfio_device *cur;
lockdep_assert_held(&dev_set->lock);
list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
if (cur->dev == &pdev->dev)
return 0;
return -EBUSY;
I don't understand this logic. If there is any device in the set that does now have the same struct device we're in trouble? Please clearly document what this is trying to do. If the bound in the name makes sense you probably want to check the driver instead.
The PCI reset this code is tring to do effects a set of PCI devices, due to how the HW works.
Along with the vfio_pci_for_each_slot_or_bus() this is computing a set wise 'is superset' between the list of pci_dev's the reset will affect (the reset group) and the list of vfio_devices that we have locking control over to sequence the reset (the dev_set).
If every PCI device we will reset is under the dev_set then we directly know it is safe to trigger the reset. If any PCI device is not in this dev_set then we cannot use the reset as we can't know what will happen to the device that we don't control.
Let's use a different word than bound? vfio_pci_check_device_in_set()?
static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev) {
/* All VFIO devices have a closed FD */
list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
if (cur->vdev.open_count)
return;
/* All devices in the group to be reset need VFIO devices */
if (vfio_pci_for_each_slot_or_bus(
vdev->pdev, vfio_pci_check_all_devices_bound, dev_set,
!pci_probe_reset_slot(vdev->pdev->slot)))
return;
/* Does at least one need a reset? */
These checks look a little strange, and the comments don't make much sense. What about an incremental patch like this?
Sure, it can go in a function
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index fbc20f6d2dd412..d8375a5e77e07c 100644 +++ b/drivers/vfio/pci/vfio_pci.c @@ -2188,10 +2188,34 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data) return 0; }
+static struct pci_dev *vfio_pci_reset_target(struct vfio_pci_device *vdev) +{
- struct vfio_device_set *dev_set = vdev->vdev.dev_set;
- struct vfio_pci_device *cur;
- /* none of the device is allowed to be currently open: */
- list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
if (cur->vdev.open_count)
return NULL;
- /* all devices in the group to be reset need to be VFIO devices: */
It is not "need to be VFIO devices" it is "need to be in our dev_set". Have the PCI dev bound to, say, a mdev VFIO device isn't good enough.
Thanks, Jason
dri-devel@lists.freedesktop.org