On 2/5/21 7:53 AM, Daniel Vetter wrote:
On Fri, Feb 05, 2021 at 11:43:19AM -0400, Jason Gunthorpe wrote:
On Fri, Feb 05, 2021 at 04:39:47PM +0100, Daniel Vetter wrote:
And again, for slightly older hardware, without pinning to VRAM there is no way to use this solution here for peer-to-peer. So I'm glad to see that so far you're not ruling out the pinning option.
Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE pages ZONE_MOVABLE (at least if you don't have a pinned-VRAM contingent in your cgroups) or something like that, so we could benefit from all the work to make sure pin_user_pages and friends never end up in there?
ZONE_DEVICE should already not be returned from GUP.
I've understood that in the HMM case the idea was that a CPU touch of some ZONE_DEVICE pages would trigger a migration back to CPU memory. GUP would want to follow the same logic; presumably it comes for free with the fault handler somehow.
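To make the "CPU touch triggers migration" point concrete: for MEMORY_DEVICE_PRIVATE memory the CPU PTE is a device-private swap entry, so a CPU access faults and the core mm calls the pgmap's ->migrate_to_ram() callback, which a driver typically implements with the migrate_vma_*() helpers. A rough sketch of that callback follows; the mydev_* names and the copy helper are made up, and exact struct fields/flags vary by kernel version:

```c
/* Hypothetical driver sketch; mydev_* names are assumptions, not real API. */
static vm_fault_t mydev_migrate_to_ram(struct vm_fault *vmf)
{
	unsigned long src = 0, dst = 0;
	struct migrate_vma args = {
		.vma	= vmf->vma,
		.start	= vmf->address,
		.end	= vmf->address + PAGE_SIZE,
		.src	= &src,
		.dst	= &dst,
		/* select only this driver's device-private pages */
		.pgmap_owner	= mydev_pgmap_owner,	/* assumed driver token */
		.flags	= MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
	};

	if (migrate_vma_setup(&args))
		return VM_FAULT_SIGBUS;

	if (src & MIGRATE_PFN_MIGRATE) {
		/* allocate a system-RAM page and copy the VRAM contents back */
		struct page *dpage = alloc_page(GFP_HIGHUSER);

		if (!dpage) {
			migrate_vma_finalize(&args);
			return VM_FAULT_SIGBUS;
		}
		lock_page(dpage);
		mydev_copy_from_vram(migrate_pfn_to_page(src), dpage); /* assumed helper */
		dst = migrate_pfn(page_to_pfn(dpage));
	}

	migrate_vma_pages(&args);
	migrate_vma_finalize(&args);
	return 0;
}

static const struct dev_pagemap_ops mydev_pgmap_ops = {
	.migrate_to_ram	= mydev_migrate_to_ram,
};
```

The same callback path is what lets GUP "come for free": following the page tables hits the device-private entry, faults, and lands back on ordinary system RAM before any pin can be taken.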
Oh, I didn't know this. I thought the proposed p2p direct I/O patches would just use the fact that underneath ZONE_DEVICE there are "normal" struct pages, and so I got worried that pin_user_pages could also creep in. But I didn't read the patches in full detail:
https://lore.kernel.org/linux-block/20201106170036.18713-12-logang@deltatee....
But if you're saying that this all needs specific code, and all the gup/pup code we have today is excluded, then I think we can make sure that we never build features that require time-unlimited pinning of ZONE_DEVICE pages. Which I think we want.
From an HMM perspective, the above sounds about right. HMM relies on the GPU/device memory being ZONE_DEVICE, *and* on that memory *not* being pinned. (HMM's mmu notifier callbacks act as a sort of virtual pin, but not a refcount pin.)
It's a nice clean design point that we need to preserve, and fortunately it doesn't conflict with anything I'm seeing here. But I want to say this out loud because I see some doubt about it creeping into the discussion.
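For reference, the "virtual pin, not a refcount pin" pattern is roughly the sequence-lock-style loop from the HMM documentation: snapshot the PTEs with hmm_range_fault() under an interval notifier, then retry if the range was invalidated, never taking a page refcount. A hedged sketch, where mydev_range, its lock, and program_device_page_tables() are invented driver-side names:

```c
/* Sketch of HMM's notifier-based "virtual pin"; mydev_* names are assumed. */
static int mydev_populate_range(struct mydev_range *r)
{
	struct hmm_range range = {
		.notifier	= &r->notifier,
		.start		= r->start,
		.end		= r->end,
		.hmm_pfns	= r->pfns,
		.default_flags	= HMM_PFN_REQ_FAULT,
	};
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(range.notifier);

	mmap_read_lock(r->mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(r->mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;
		return ret;
	}

	mutex_lock(&r->lock);
	if (mmu_interval_read_retry(range.notifier, range.notifier_seq)) {
		/* a notifier invalidation raced us; redo the snapshot */
		mutex_unlock(&r->lock);
		goto again;
	}
	/*
	 * The PFNs stay valid only until the next invalidate callback;
	 * no refcount is taken, so nothing is actually pinned.
	 */
	program_device_page_tables(r, range.hmm_pfns);	/* assumed helper */
	mutex_unlock(&r->lock);
	return 0;
}
```

Because the snapshot is discarded on invalidation rather than held by a refcount, the pages remain fully movable, which is exactly the design point being defended above.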
thanks,