On 02.07.20 at 15:29, Jason Gunthorpe wrote:
On Thu, Jul 02, 2020 at 03:10:00PM +0200, Daniel Vetter wrote:
On Wed, Jul 01, 2020 at 02:15:24PM -0300, Jason Gunthorpe wrote:
On Wed, Jul 01, 2020 at 05:42:21PM +0200, Daniel Vetter wrote:
> All you need is the ability to stop, wait for ongoing accesses to end, and make sure that new ones grab a new mapping. Swap and flush isn't a general HW ability either..
I'm unclear how this could be useful. Isn't it guaranteed to corrupt in-progress writes?
Did you mean pause, swap and resume? That's ODP.
Yes, something like this. And good to know, never heard of ODP.
Hm I thought ODP was full hw page faults at an individual page level,
Yes
and this stop & resume is for the entire NIC. Under the hood both apply back-pressure on the network if a transmission can't be received, but
NICs don't do stop and resume. Blocking the Rx pipe is very problematic and performance destroying.
The strategy for something like ODP is more complex, and so far no NIC has deployed it at any granularity larger than per-page.
So since Jason really doesn't like dma_fence much I think for rdma synchronous it is. And it shouldn't really matter, since waiting for a small transaction to complete at rdma wire speed isn't really that long an operation.
Even if DMA fence were to somehow be involved, how would it look?
Well above you're saying it would be performance destroying, but let's pretend that's not a problem :-) Also, I have no clue about rdma, so this is really just the flow we have on the gpu side.
I see. No, this is not workable: the command flow in RDMA is not at all like GPU. What you are proposing is a global 'stop the whole chip' for Tx and Rx flows for an undetermined time. Not feasible.
What we can do is use ODP techniques and pause only the MR attached to the DMA buf with the process you outline below. This is not so hard to implement.
Well it boils down to only two requirements:
1. You can stop accessing the memory or addresses exported by the DMA-buf.
2. Before the next access you need to acquire a new mapping.
How you do this is entirely up to you. E.g. you can stop everything, just prevent access to this DMA-buf, or just pause the users of this DMA-buf....
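The two requirements above can be sketched as a toy user-space model. This is only an illustration under stated assumptions, not kernel code and not the dma-buf API: `Importer`, `begin_access`, `end_access` and `move` are hypothetical names, and a string stands in for the buffer's backing location.

```python
import threading

class Importer:
    """Hypothetical importer of one movable buffer (toy model, not dma-buf)."""

    def __init__(self, location):
        self.location = location      # where the exporter currently keeps the buffer
        self.mapping = None           # importer's cached mapping, None = invalid
        self.in_flight = 0            # accesses currently using the mapping
        self.paused = False           # set while the exporter moves the buffer
        self.remaps = 0               # how often a new mapping was acquired
        self.cond = threading.Condition()

    def begin_access(self):
        with self.cond:
            while self.paused:        # requirement 1: no new access during a move
                self.cond.wait()
            if self.mapping is None:  # requirement 2: acquire a new mapping first
                self.mapping = self.location
                self.remaps += 1
            self.in_flight += 1
            return self.mapping

    def end_access(self):
        with self.cond:
            self.in_flight -= 1
            self.cond.notify_all()

    def move(self, new_location):
        """Exporter side: pause, wait for ongoing accesses, move, invalidate."""
        with self.cond:
            self.paused = True
            while self.in_flight:     # wait for ongoing accesses to end
                self.cond.wait()
            self.location = new_location
            self.mapping = None       # the old mapping must never be used again
            self.paused = False
            self.cond.notify_all()
```

Only the users of this one buffer are paused in the sketch; whether a real device pauses everything, one MR, or one queue is exactly the part that is left up to the importer.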
- rdma driver worker gets busy to restart rx:
thanks to ww_mutex deadlock avoidance this is possible
- lock all dma-buf that are currently in use (dma_resv_lock).
Why all? Why not just lock the one that was invalidated to restore the mappings? Is that some artifact of the GPU approach?
No, but you must make sure that mapping one doesn't invalidate others you need.
Otherwise you can end up in a nice livelock :)
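A rough user-space sketch of the "lock everything, then restore" idea follows. It assumes a toy rule that any move of a buffer must take that buffer's lock, so holding all locks means restoring one mapping cannot invalidate another. `Buf`, `lock_all` and `restore_mappings` are hypothetical names. The back-off loop only imitates the shape of the ww_mutex pattern (drop everything, block on the contended lock, retry); unlike real ww_mutex it has no ticket ordering and therefore does not guarantee forward progress.

```python
import threading

class Buf:
    """Hypothetical buffer with a per-buffer lock (stand-in for a reservation lock)."""
    def __init__(self, location):
        self.location = location
        self.mapping = None           # None = mapping was invalidated
        self.lock = threading.Lock()

def lock_all(locks):
    """Acquire every lock, backing off completely whenever one is contended."""
    first = None                      # lock we block on after a back-off
    while True:
        held = []
        if first is not None:
            first.acquire()           # slow path: wait for the contended lock
            held.append(first)
        contended = None
        for lk in locks:
            if lk is first:
                continue
            if lk.acquire(blocking=False):
                held.append(lk)
            else:
                contended = lk
                break
        if contended is None:
            return held               # all locks are held
        for h in reversed(held):      # back off: release everything
            h.release()
        first = contended

def restore_mappings(buffers):
    """Restore all invalid mappings while no buffer can be moved."""
    held = lock_all([b.lock for b in buffers])
    try:
        for b in buffers:
            if b.mapping is None:
                b.mapping = b.location
    finally:
        for lk in reversed(held):
            lk.release()
```

Locking only the invalidated buffer would be enough in this toy, because nothing here evicts other buffers. The sketch takes all locks to mirror the concern raised above: in a system where mapping one buffer may push another one out, restoring them one at a time can chase its own tail.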
And why is this done with work queues and locking instead of a callback saying the buffer is valid again?
You can do this as well, but a work queue is usually easier to handle than a notification in an interrupt context of a foreign driver.
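To illustrate the work queue option, here is a minimal user-space sketch in which the notification only queues work and a separate worker does the actual restore. Everything in it is an assumption for illustration: `RemapWorker` and `on_invalidate` are hypothetical names, a Python thread stands in for a kernel work queue, and appending to a list stands in for locking, remapping and resuming.

```python
import queue
import threading

class RemapWorker:
    """Hypothetical worker that restores mappings outside the notifier's context."""

    def __init__(self):
        self.pending = queue.Queue()  # unbounded, so put() never waits for space
        self.remapped = []            # record of buffers the worker has handled
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def on_invalidate(self, buf_name):
        # Called from the foreign driver's context: do nothing heavy here,
        # just hand the buffer over to the worker.
        self.pending.put(buf_name)

    def _run(self):
        while True:
            name = self.pending.get()
            if name is None:          # sentinel: shut down
                self.pending.task_done()
                return
            # Stand-in for: take the locks, restore the mapping, resume users.
            self.remapped.append(name)
            self.pending.task_done()

    def flush(self):
        self.pending.join()           # wait until all queued work is done

    def stop(self):
        self.pending.put(None)
        self.thread.join()
```

The point of the split is that the worker runs in a context where it may sleep and take locks, which the notifying context may not allow.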
Regards, Christian.
Jason