On 2019-01-31 12:02 p.m., Jason Gunthorpe wrote:
> I still think the right direction is to build on what Logan has done - realize that he created a DMA-only SGL - make that a formal type of the kernel and provide the right set of APIs to work with this type, without being forced to expose struct page.
> Basically invert the API flow - the DMA map would be done close to GUP, not buried in the driver. This absolutely doesn't work for every flow we have, but it does enable the ones that people seem to care about when talking about P2P. It also does present a path to solve some cases of the O_DIRECT problems if the block stack can develop some way to know if an IO will go down a DMA-only IO path or not... This seems less challenging than auditing every SGL user for iomem safety??
The DMA-only SGL will work for some use cases, but I think it's going to be a challenge for others. We care most about NVMe and, therefore, the block layer.
Given my understanding of the block layer and its queuing infrastructure, I don't think having a separate DMA-only IO path makes sense. I think it has to be the same path, but with a special DMA-only bio, and endpoints would have to indicate support for that bio. I can't say I have a deep enough understanding of the block layer to know how feasible that would be.
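Roughly, the "same path, special bio" idea could be sketched as below. This is a toy user-space model, not kernel code: every name in it (toy_bio, toy_queue, TOY_BIO_DMA_ONLY, TOY_QUEUE_CAP_DMA_ONLY, toy_submit_bio) is made up for illustration and does not correspond to an existing block-layer API. It only shows the shape of the check: all bios go through one submission function, and a DMA-only bio is rejected unless the endpoint's queue has advertised support for it.

```c
/* Toy user-space model; none of these names are real kernel APIs. */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: this bio carries DMA addresses only, no struct page. */
#define TOY_BIO_DMA_ONLY	(1u << 0)

/* Hypothetical capability bit an endpoint driver sets on its queue. */
#define TOY_QUEUE_CAP_DMA_ONLY	(1u << 0)

struct toy_queue {
	unsigned int caps;	/* capabilities advertised by the endpoint */
};

struct toy_bio {
	unsigned int flags;	/* TOY_BIO_* flags */
	struct toy_queue *q;	/* queue the bio is destined for */
};

static bool toy_queue_supports_dma_only(const struct toy_queue *q)
{
	return (q->caps & TOY_QUEUE_CAP_DMA_ONLY) != 0;
}

/*
 * Single submission path for every bio. The only special handling is
 * refusing a DMA-only bio when the endpoint has not opted in.
 */
static int toy_submit_bio(const struct toy_bio *bio)
{
	if ((bio->flags & TOY_BIO_DMA_ONLY) &&
	    !toy_queue_supports_dma_only(bio->q))
		return -EOPNOTSUPP;

	return 0;	/* a real implementation would queue the bio here */
}
```

In this model an ordinary bio is accepted by any queue, while a DMA-only bio is accepted only by a queue that set the capability bit. Whether the real block layer could carry such a flag through its queuing and stacking code is exactly the open question above.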
Logan