On Thu, Sep 09 2021, Jason Gunthorpe jgg@nvidia.com wrote:
The subchannel should be left in a quiescent state unless the VFIO device FD is opened. When the FD is opened bring the chanel to active and allow the VFIO device to operate. When the device FD is closed then quiesce the channel.
To make this work the FSM needs to handle the transitions to/from open and closed so everything is sequenced. Rename state NOT_OPER to BROKEN and use it wheneven the driver has malfunctioned. STANDBY becomes CLOSED. The normal case FSM looks like: CLOSED -> IDLE -> PROCESS/PENDING* -> IDLE -> CLOSED
With a possible branch off to BROKEN from any state. Once the device is in BROKEN it cannot be recovered other than be reloading the driver.
Hm, not sure whether it is a good idea to conflate "something went wrong" and "device is not operational". In the latter case, we will eventually get a removal of the css device when the common I/O layer has processed the channel report for the subchannel; while the former case could mean all kind of things, but the subchannel will likely stay around. I think NOT_OPER was always meant to be a transitional state.
Delete the triply redundant calls to vfio_ccw_sch_quiesce(). vfio_ccw_mdev_close_device() always leaves the subchannel quiescent. vfio_ccw_mdev_remove() cannot return until vfio_ccw_mdev_close_device() completes and vfio_ccw_sch_remove() cannot return until vfio_ccw_mdev_remove() completes. Have the FSM code take care of calling cp_free() when appropriate.
I remember some serialization issues wrt cp_free() etc. coming up every now and than; that might need extra care (I'm taking a look.)
Device reset becomes a CLOSE/OPEN sequence which now properly handles the situation if the device becomes BROKEN.
Machine shutdown via vfio_ccw_sch_shutdown() now simply tries to close and leaves the device BROKEN (though arguably the bus should take care to quiet down the subchannel HW during shutdown, not the drivers)
The problem is that there is not really a uniform thing that can be done for shutdown; e.g. we can quiesce and then disable I/O and EADM subchannels, but CHSC subchannels cannot be quiesced.