From: Jason Gunthorpe jgg@nvidia.com Sent: Thursday, April 14, 2022 10:22 PM
On Thu, Apr 14, 2022 at 09:51:49AM -0400, Matthew Rosato wrote:
On 4/12/22 11:53 AM, Jason Gunthorpe wrote:
When the open_device() op is called the container_users is incremented
and
held incremented until close_device(). Thus, so long as drivers call functions within their open_device()/close_device() region they do not need to worry about the container_users.
These functions can all only be called between open_device()/close_device():
vfio_pin_pages() vfio_unpin_pages() vfio_dma_rw() vfio_register_notifier() vfio_unregister_notifier()
So eliminate the calls to vfio_group_add_container_user() and add a
simple
WARN_ON to detect mis-use by drivers.
vfio_device_fops_release decrements dev->open_count immediately
before
calling dev->ops->close_device, which means we could enter close_device
with
a dev_count of 0.
Maybe vfio_device_fops_release should handle the same way as vfio_group_get_device_fd?
if (device->open_count == 1 && device->ops->close_device) device->ops->close_device(device); device->open_count--;
Yes, thanks alot! I have nothing to test these flows on...
It matches the ordering in the only other place to call close_device.
I folded this into the patch:
While it's a welcomed fix is it actually related to this series? The point of this patch is that those functions are called when container_users is non-zero. This is true even without this fix given container_users is decremented after calling device->ops->close_device().
iiuc this might be better sent out as a separate fix out of this series? Or at least add a comment in the commit msg about taking chance to fix an unrelated issue to not cause confusion...
Thanks Kevin