On Thu, Feb 04, 2021 at 11:00:32AM -0800, John Hubbard wrote:
On 2/4/21 10:44 AM, Alex Deucher wrote: ...
The argument is that vram is a scarce resource, but I don't know if that is really the case these days. At this point, we often have as much vram as system ram if not more.
I thought the main argument was that GPU memory could move at any time between the GPU and CPU and the DMA buf would always track its current location?
I think the reason for that is that VRAM is scarce, so we have to be able to move it around. We don't enforce the same limitations for buffers in system memory. We could just support pinning dma-bufs in vram like we do with system ram. Maybe with some conditions, e.g., that p2p is possible and the device has a large BAR, so you aren't tying up the BAR window.
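[For concreteness, a rough sketch of what such a conditional pin could look like in a driver's dma_buf_ops .pin callback. Only the dma_buf_attachment / peer2peer plumbing is the actual dma-buf interface; the my_*() helpers and the BAR check are made-up placeholders for whatever the driver really uses.]

/* Hypothetical driver pin callback: pinning in system memory is
 * always allowed, a VRAM pin only when the importer can do p2p and
 * the device exposes a large enough BAR. my_*() are placeholders. */
static int my_dmabuf_pin(struct dma_buf_attachment *attach)
{
        struct my_bo *bo = my_bo_from_dmabuf(attach->dmabuf);

        if (!my_bo_is_vram(bo))
                return my_bo_pin(bo);

        /* VRAM: require p2p capability and a large BAR window. */
        if (!attach->peer2peer || !my_dev_has_large_bar(bo->dev))
                return -EINVAL;

        return my_bo_pin(bo);
}

static const struct dma_buf_ops my_dmabuf_ops = {
        /* .map_dma_buf, .unmap_dma_buf, .release, ... as usual */
        .pin = my_dmabuf_pin,
};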
Minimally we need cgroups for that vram so it can be managed; that work is unfortunately a bit stuck. But if we have cgroups with some pin limit, I think we can easily lift this.
Excellent. And yes, we are already building systems in which VRAM is definitely not scarce, but on the other hand, those newer systems can also handle GPU (and NIC) page faults, so it's not really an issue there. For that, we just need to enhance HMM so that it does peer-to-peer.
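[As a reference point, the existing HMM mirroring loop drivers use today looks roughly like the sketch below; the missing piece referred to above is letting these faults resolve to a peer device's memory rather than only system RAM or the device's own pages. my_mirror_range() is an invented name; the rest is the current hmm_range_fault() / mmu_interval_notifier interface as I understand it.]

/* Simplified version of today's HMM page-table mirroring loop. */
static int my_mirror_range(struct mmu_interval_notifier *notifier,
                           unsigned long start, unsigned long end,
                           unsigned long *pfns)
{
        struct hmm_range range = {
                .notifier      = notifier,
                .start         = start,
                .end           = end,
                .hmm_pfns      = pfns,
                .default_flags = HMM_PFN_REQ_FAULT,
        };
        int ret;

retry:
        range.notifier_seq = mmu_interval_read_begin(notifier);
        mmap_read_lock(notifier->mm);
        ret = hmm_range_fault(&range);
        mmap_read_unlock(notifier->mm);
        if (ret == -EBUSY)
                goto retry;
        if (ret)
                return ret;

        /* A real driver takes its page-table lock here and rechecks
         * the sequence count before programming the GPU with the pfns. */
        if (mmu_interval_read_retry(notifier, range.notifier_seq))
                goto retry;

        return 0;
}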
We also have some older hardware with large BAR1 apertures, specifically for this sort of thing.
And again, for slightly older hardware, without pinning to VRAM there is no way to use this solution for peer-to-peer. So I'm glad to see that so far you're not ruling out the pinning option.
Since HMM and ZONE_DEVICE came up: I'm kinda tempted to make ZONE_DEVICE ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your cgroups), or something like that, so we could benefit from the work to make sure pin_user_pages() and all these never end up in there?
https://lwn.net/Articles/843326/
Kinda inspired by the recent LWN article. -Daniel
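[For reference, the work being pointed at here is, as I understand it, on the long-term pin side: with FOLL_LONGTERM, pin_user_pages() migrates pages out of ZONE_MOVABLE (and CMA) before taking the pin, so those zones never carry unmovable long-term pins; treating ZONE_DEVICE like a movable zone would extend that guarantee. A minimal sketch of such a pin, using the 5.x-era signature (which still took a vmas argument); my_longterm_pin() is an invented wrapper.]

/* Invented wrapper: long-term pin of a user address range. */
static int my_longterm_pin(unsigned long uaddr, unsigned long nr_pages,
                           struct page **pages)
{
        long pinned;

        mmap_read_lock(current->mm);
        /* FOLL_LONGTERM makes GUP migrate movable pages before pinning. */
        pinned = pin_user_pages(uaddr, nr_pages,
                                FOLL_WRITE | FOLL_LONGTERM, pages, NULL);
        mmap_read_unlock(current->mm);

        if (pinned < 0)
                return pinned;
        if (pinned != nr_pages) {
                unpin_user_pages(pages, pinned);
                return -EFAULT;
        }
        return 0;
}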